Robust and Efficient One-way MANOVA Tests -
Supplemental Material
Stefan Van Aelst and Gert Willems
Department of Applied Mathematics and Computer Science
Ghent University, Krijgslaan 281 S9, B-9000 Ghent, Belgium.
Abstract
In this supplemental report we present additional results related to the robust MANOVA
tests introduced in the accompanying manuscript. We show Fisher-consistency of the k-sample
S- and MM-estimators at unimodal elliptical distributions. We provide simulation results to
verify the performance of the bootstrap-based tests in finite samples. Finally, we illustrate the
use of these robust test statistics on a second real data example. The Appendix also contains
the proofs of the theorems in the manuscript.
Keywords: MM-estimators, Fast and Robust Bootstrap, Wilks’ Lambda, Outliers.
1 Introduction
In the accompanying manuscript we introduced the robust MANOVA test statistics SΛ1_F,
SΛ2a_F, and SΛ2b_F based on one-sample and k-sample S-estimators; see respectively expressions
(23), (7) and (8) of the manuscript. Based on MM-estimators we introduced the related test
statistics MMΛa_F and MMΛb_F, given by expressions (11) and (12) in the manuscript. As in
the manuscript, we replace the subscript F by n in the finite-sample case. To estimate the
p-values of the robust tests based on these test statistics, we introduced a fast and robust
bootstrap (FRB) procedure.
In the next section, we provide a proposition that shows Fisher-consistency of the k-sample
S- and MM-estimators at unimodal elliptical distributions under fairly weak conditions on the
loss functions. Section 3 contains further results on the finite-sample accuracy of the FRB-based
robust tests. In Section 4 we provide further simulation results regarding the robustness of the
null distribution of the test statistics, irrespective of how the p-value is estimated. Section 5
first investigates the power of the tests by examining to what extent the distribution of the
test statistics under the alternative differs from the null distribution. We then investigate the
effect of contamination on the power of the robust tests when the p-value is estimated by FRB.
Section 6 contains a second real data example where we illustrate the use of the MM-based
tests for both multivariate and univariate ANOVA. The Appendix contains the proof of the
Fisher-consistency result and the proofs of the theorems in the manuscript.
2 Fisher-consistency
In this section we investigate Fisher-consistency of the k-sample S and MM-estimators when the
groups follow a unimodal elliptical distribution with possibly different centers, i.e. Fj = F_{µj,Σ}.
The proof of this proposition, which is given in the Appendix, is similar to the Fisher-consistency
proof for general multivariate location M-estimators in Alqallaf, Van Aelst, Yohai and Zamar
(2009).
Proposition 1. Suppose that ρ0 and ρ1 are non-decreasing, continuous and strictly increasing
at zero. Moreover, suppose that the groups have elliptical distributions Fj = F_{µj,Σ} (j = 1, . . . , k)
with density

f_{µj,Σ}(x) = f( (x − µj)^t Σ^{-1} (x − µj) ).

The scatter matrix Σ can be decomposed into a scale part σ and a shape part Γ, i.e. Σ = σ²Γ.
The function f is assumed to be non-increasing, continuous and strictly decreasing at zero,
such that the distributions Fj are unimodal elliptical distributions. Let b0 = E_{F_{0,I}}[ρ0(∥x∥)].
Then µ̂^(k)_{j,F} = µj (j = 1, . . . , k), Σ̂^(k)_F = Σ, and µ̃^(k)_{j,F} = µj (j = 1, . . . , k), Γ̃^(k)_F = Γ. That is, the
k-sample S and MM-estimators are Fisher-consistent.
Note that the conditions on the loss functions ρ0 and ρ1 in Proposition 1 are fairly weak and
are clearly satisfied for loss functions that satisfy conditions (R1) and (R2) of the manuscript.
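Conditions of this kind are satisfied, for instance, by Tukey's biweight loss, a common choice for S- and MM-estimators. The sketch below is an illustration only (the tuning constant 4.685 and the normalization to [0, 1] are our assumptions; conditions (R1) and (R2) of the manuscript are not reproduced here):

```python
import numpy as np

def rho_biweight(u, c=4.685):
    """Tukey biweight loss, normalized to rise from 0 to 1.

    Non-decreasing, continuous, and strictly increasing at zero, as
    required of the loss functions in Proposition 1."""
    u = np.minimum(np.abs(u), c)          # constant (= 1) beyond the cutoff c
    return 1.0 - (1.0 - (u / c) ** 2) ** 3

u = np.linspace(0.0, 8.0, 401)
r = rho_biweight(u)                       # non-decreasing, flat for u >= c
```

The boundedness beyond the cutoff c is what gives the resulting estimators their resistance to outliers.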
3 Finite-sample accuracy of the FRB based tests
In this section we investigate further how closely the actual Type I error rates of our tests
match their nominal values in finite samples. As in Section 5 of the manuscript, we generated
data under the null hypothesis of equal means. In particular, the k samples were all drawn
from the multivariate standard normal distribution N(0, Ip), so µ1 = . . . = µk = 0. For
k = 2, 3 and 5 groups, we considered the combinations of dimension p and sample sizes nj
given in Table 1. As in the manuscript, for each number of groups k, we show the simulation
results of the cases ordered by increasing ratio n/p, which expresses the relative sample size
taking the dimension into account. The (id) column in Table 1 gives the position of each case
in the corresponding plots. For each case, 1000 samples were again generated and the tests
used FRB with B = 1000 bootstrap samples, yielding a p-value estimate for each test as
defined in (24).
k = 2            | k = 3                | k = 5
p n1 n2 (id)     | p n1 n2 n3 (id)      | p n1 n2 n3 n4 n5 (id)
2 10 10 (2)      | 2 10 10 10 (2)       | 2 10 10 10 10 10 (2)
2 20 20 (7)      | 2 20 20 20 (8)       | 2 20 20 20 20 20 (8)
2 30 30 (9)      | 2 30 30 30 (10)      | 2 30 30 30 30 30 (10)
2 50 50 (12)     | 2 50 50 50 (13)      | 2 50 50 50 50 50 (13)
2 100 100 (14)   | 2 100 100 100 (15)   | 2 100 100 100 100 100 (14)
2 200 200 (15)   | 2 20 20 10 (5)       | 2 20 20 10 10 10 (5)
2 20 10 (5)      | 2 30 30 10 (9)       | 2 50 50 10 10 10 (9)
2 30 10 (8)      | 2 50 50 20 (12)      | 2 100 50 20 20 20 (12)
2 50 20 (11)     | 2 100 50 20 (14)     | 6 20 20 20 20 20 (1)
6 20 20 (1)      | 6 20 20 20 (1)       | 6 30 30 30 30 30 (3)
6 30 30 (3)      | 6 30 30 30 (3)       | 6 50 50 50 50 50 (7)
6 50 50 (6)      | 6 50 50 50 (6)       | 6 100 100 100 100 100 (11)
6 100 100 (10)   | 6 100 100 100 (11)   | 6 50 50 20 20 20 (4)
6 200 200 (13)   | 6 50 50 20 (4)       | 6 100 50 20 20 20 (6)
6 50 20 (4)      | 6 100 50 20 (7)      |
Table 1: Simulation settings under the null hypothesis to investigate the accuracy of the FRB
tests.
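The Monte Carlo logic of this experiment can be mimicked with classical ingredients only. The sketch below is a hedged stand-in: it uses the classical Wilks' Lambda with Bartlett's chi-square approximation instead of the S/MM-statistics with FRB (which are not reproduced here), but the scheme — generate null samples, test at the 5% level, record the rejection rate — is the same. All constants (k = 3, p = 2, nj = 50, 500 replications) are illustrative choices:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
k, p, nj, reps, level = 3, 2, 50, 500, 0.05
crit = chi2.ppf(1 - level, p * (k - 1))
rej = 0
for _ in range(reps):
    groups = [rng.standard_normal((nj, p)) for _ in range(k)]  # H0: equal means
    X = np.vstack(groups)
    gm = X.mean(axis=0)
    # within- and between-group SSCP matrices
    W = sum((g - g.mean(0)).T @ (g - g.mean(0)) for g in groups)
    B = sum(nj * np.outer(g.mean(0) - gm, g.mean(0) - gm) for g in groups)
    lam = np.linalg.det(W) / np.linalg.det(W + B)              # Wilks' Lambda
    stat = -(k * nj - 1 - (p + k) / 2) * np.log(lam)           # Bartlett approximation
    rej += stat > crit
rate = rej / reps   # observed Type I error; should be near the nominal 0.05
```

Replacing the classical estimates and the chi-square reference by robust estimates and FRB-resampled p-values gives exactly the experiment reported in Table 1 and Figure 1.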
The curves in Figure 1 show the Type I error rates that would be observed if the tests were
performed at the 5% significance level. The top row in Figure 1 contains the results for k = 2,
the middle row for k = 3 and the bottom row for k = 5. The left plots consider tests based
on S-estimates, while the right plots consider MM-estimates. The results for other significance
levels were found to be similar.

[Figure 1 about here: six panels of observed Type I error rate (y-axis, 0 to 0.2) against case
number (x-axis, 0 to 15); legends: S1/S2a/S2b (left column), Ma/Mb (right column)]

Figure 1: Observed FRB-based Type I-error rates for nominal level 0.05, various sample sizes
and dimensions; the top panel contains the results for k = 2 groups, the middle panel for k = 3
groups, and the bottom panel for k = 5 groups; in each plot the cases are ordered as indicated
in Table 1.
The plots in Figure 1 confirm the conclusions in the manuscript. That is, the FRB-based
tests become more accurate as the relative sample size increases. Moreover, the FRB-based
tests are more accurate for MM-estimates than for S-estimates. Especially the SΛ2b_n test turns
out to be overly liberal. For the MM-based tests, the results for MMΛb_n are slightly better
than those for MMΛa_n. We conclude that the FRB is a sufficiently reliable method to obtain
p-values for the MM-based tests, even in small samples, while larger sample sizes are required
to obtain accurate p-values for the S-based tests.
4 Robustness of the null distribution
Here, we investigate the effect of contamination on the (true) null distribution of the test
statistics, which does not depend on the FRB or on how the p-value is estimated. We consider
the same test statistics as in Section 6 of the manuscript: the five test statistics based on
S/MM-estimators, the classical Wilks' Lambda test statistic, the rank-transformed MANOVA
test statistic, and the MCD test statistic.
The simulation setting is the same as in Section 6 of the manuscript. In particular, groups
X1 to X_{k−1} are generated according to N(0, Ip), and group Xk follows the contamination model

(1 − ϵ) N(0, Ip) + ϵ N(µc, Ip),

where µc = d (χ²_{p,.999}/p)^{1/2} and d = 2, 5 or 10. We consider the setting of k = 3, with sample
sizes nj = 20 or nj = 100 (j = 1, . . . , 3) and dimension p = 2 or p = 6. The outlier proportion
was fixed at ϵ = 0.10 and 1000 samples were generated for each case.
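For concreteness, the contaminated group can be generated as in the sketch below. Since N(0, Ip) is spherically symmetric, only the length d(χ²_{p,.999}/p)^{1/2} of the contamination center matters; placing it along the first coordinate axis is our assumption for this sketch:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
p, d, eps, nj = 2, 5, 0.10, 100
length = d * np.sqrt(chi2.ppf(0.999, p) / p)   # ||mu_c||
mu_c = np.zeros(p)
mu_c[0] = length                               # direction: assumption (spherical symmetry)
mask = rng.random(nj) < eps                    # which observations are outliers
Xk = rng.standard_normal((nj, p)) + np.where(mask[:, None], mu_c, 0.0)
```

With d = 5 and p = 2 the outliers sit about 13 standard deviations from the bulk of the data, i.e. clearly outside the 99.9% quantile contour of the clean distribution.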
Figures 2 and 3 show probability-probability plots of the empirical distribution of the test
statistics in the contaminated data versus the distribution in the non-contaminated data. In
particular, with N = 1000, the i-th data point in the plots (i = 1, . . . , N) equals

( i/(N+1), F̂_cont( F̂_norm^{-1}( i/(N+1) ) ) ),

where F̂_cont and F̂_norm denote the empirical distributions of the statistic in respectively the
contaminated and the non-contaminated samples. Large deviations from the diagonal indicate
an adverse effect of the outliers.
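These PP-plot points can be computed directly from the two sets of simulated statistics; a minimal numpy sketch (the function name and the interpolation choice for the empirical quantiles are ours):

```python
import numpy as np

def pp_points(stats_cont, stats_norm):
    """PP-plot points (i/(N+1), F_cont(F_norm^{-1}(i/(N+1)))), i = 1..N."""
    N = len(stats_norm)
    u = np.arange(1, N + 1) / (N + 1)
    q = np.quantile(stats_norm, u)                 # empirical quantiles of F_norm
    # empirical cdf of the contaminated statistics, evaluated at those quantiles
    ecdf = np.searchsorted(np.sort(stats_cont), q, side="right") / len(stats_cont)
    return u, ecdf
```

Feeding the same sample in twice yields points on the diagonal, i.e. the no-contamination reference line of Figures 2 and 3.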
Figure 2 shows the results for n1 = n2 = n3 = 20 and p = 2, where the outlier distance is
d = 2 in the upper row and d = 10 in the lower row. The plots on the left correspond to the
S-based test statistics, the middle plots to the MM-based test statistics, and the plots on the
right correspond to Wilks' Lambda, its rank-transformed version and the MCD test statistic.
Figure 3 shows the results for the case n1 = n2 = n3 = 100 and p = 6. It can be seen that
the S-based and MCD-based statistics are most resistant to the outliers in all cases, while the
MM-based statistics are only slightly more affected. Neither the classical Wilks' Lambda nor
its rank-transformed version is very robust, as is especially clear in Figure 3.

[Figure 2 about here: six PP-plot panels, top row labeled "Outlier Distance = 2", bottom row
"Outlier Distance = 10"; legends: S1/S2a/S2b, Ma/Mb, Clas/Rank/MCD]

Figure 2: PP-plots comparing the distribution of the test statistics under the null hypothesis
in contaminated versus non-contaminated samples, with n1 = n2 = n3 = 20, p = 2.

[Figure 3 about here: analogous PP-plot panels for the larger sample size and dimension]

Figure 3: PP-plots comparing the distribution of the test statistics under the null hypothesis
in contaminated versus non-contaminated samples, with n1 = n2 = n3 = 100, p = 6.
5 Power of the tests
Now we compare the finite-sample power of the tests. We generate samples under the
alternative hypothesis Ha as in Section 7 of the manuscript. In particular, groups X1 to X_{k−1}
are generated according to N(0, Ip) as before, but group Xk is generated from N(µd, Ip) where
µd = (d, 0, . . . , 0) with d = 0.2, 0.5, 0.7, 1, 1.5 or 2. For k = 3 groups, we generated 1000 samples
for the same combinations of dimension and sample size as in the previous section.
We first examine to what extent the test statistics can differentiate between H0 and Ha,
independently of how the p-values are estimated. This corresponds to the power we would
obtain if the exact null distribution were at our disposal. Figures 4 and 5 show probability-
probability plots of the empirical distribution under Ha versus that under H0. Figure 4 shows
the case n1 = n2 = n3 = 20 and p = 2, while Figure 5 shows the case n1 = n2 = n3 = 20 and
p = 6. For nj = 100 the power was consistently very high, and these results are omitted here.
In both figures, the top row corresponds to d = 0.5 and the bottom row to d = 1. The
eight test statistics are distributed over the three plots as before, but the curve for the classical
Wilks’ Lambda is shown in each of the three plots for ease of comparison. The classical Wilks’
Lambda, being the likelihood ratio statistic under normality, obviously has the highest power.
However, the robust MM-based statistics are almost equally powerful. They are slightly better
than the rank-based statistic and also more powerful than the MCD-based test, especially
in higher dimensions as seen in Figure 5. The test statistics based on S-estimates clearly
have lower power than their MM-based counterparts, although their performance is better in
higher dimensions (it is well-known that S-estimates become rapidly more efficient in higher
dimensions).
[Figure 4 about here: six size-power panels, top row d = 0.5, bottom row d = 1; legends:
Clas/S1/S2a/S2b, Clas/Ma/Mb, Clas/Rank/MCD]

Figure 4: Size-power curves, comparing the distribution of the test statistics under Ha versus
under H0 for the case n1 = n2 = n3 = 20, p = 2.

[Figure 5 about here: analogous size-power panels for p = 6]

Figure 5: Size-power curves, comparing the distribution of the test statistics under Ha versus
under H0 for the case n1 = n2 = n3 = 20, p = 6.

We now examine the possible effect of outliers on the power of the tests. For each case, we
generate 1000 random samples under Ha as above, but the observations in Xk are contaminated
so that the classical test is tempted to accept the null hypothesis that Xk has the same mean
as the other groups. This is accomplished by drawing the observations in Xk from

(1 − ϵ) N(µd, Ip) + ϵ N(µ_{−d/ϵ}, Ip)

with µd = (d, 0, . . . , 0) as before and similarly µ_{−d/ϵ} = (−d/ϵ, 0, . . . , 0). This is indeed the
worst-case contamination scenario for the power of the classical Wilks' Lambda, while it is
likely approximately worst-case for the robust tests as well. We again take ϵ = 0.10 and
consider d = 0.5, 0.7, 1, 1.5 and 2.
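As a numeric sanity check on this contamination scheme (a sketch; the sample size is chosen only to tame Monte Carlo noise): the mixture mean of the first coordinate is (1 − ϵ)d + ϵ(−d/ϵ) = −ϵd, so the shift d that the test should detect is almost entirely masked by the outliers:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, d, eps = 200_000, 2, 1.0, 0.10
mu_d = np.zeros(p)
mu_d[0] = d
mu_out = np.zeros(p)
mu_out[0] = -d / eps
mask = rng.random(n) < eps                     # outlier indicators
Xk = rng.standard_normal((n, p)) + np.where(mask[:, None], mu_out, mu_d)
mean1 = Xk[:, 0].mean()   # close to -eps*d = -0.1, far from the true shift d = 1
```

A test based on the sample mean therefore sees almost no group difference, which is exactly the power breakdown of the classical test visible in Figure 6.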
Figure 6 shows the observed power under contamination for sample sizes n1 = n2 = n3 = 20
in p = 2 and p = 6 dimensions. Hence, these results can be compared to the power found
without contamination in Figure 5 of the manuscript. First note that, not surprisingly, the
classical test shows complete power breakdown. It can further be seen that all the robust
alternatives lose only a small amount of power. Moreover, the outliers have not affected the
performance order of the robust tests, with MMΛa_n and MMΛb_n yielding the most powerful
tests.
From the influence functions in Theorem 2 of the manuscript, we know that the MM-based
statistics have a larger bias than the S-based robust test statistics for intermediate outliers.
For larger samples, this larger bias may weaken the MM-based tests in comparison with their
competitors. This can be seen in Figure 7, which shows the results for sample sizes
n1 = n2 = n3 = 100 in p = 2 and p = 6 dimensions respectively. While the power now
converges more rapidly to 1 as the distance d increases, the S and MCD tests enjoy better
power for small d. Indeed, these estimators have a higher variance but a lower bias than the
MM-estimators for nearby outliers. Especially for large n this bias effect prevails, yielding a
disadvantage for the MM-based tests.
[Figure 6 about here: power versus d, three panels per row; legends: Clas/S1/S2a/S2b,
Clas/Ma/Mb, Clas/Rank/MCD]

Figure 6: Power for the case n1 = n2 = n3 = 20 with 10% outliers in the last group. The top
panel considers the case p = 2 for Ha: µ3 = (d, 0). The bottom panel considers the case p = 6
for Ha: µ3 = (d, 0, . . . , 0).

6 Example: Biting flies

We consider the biting flies data from Johnson and Wichern (1998), consisting of two groups
of 35 flies (Leptoconops torrens and Leptoconops carteri), on which 7 characteristics were
measured. The second group contains an obvious outlier in the second variable (wing width),
as can be seen from the boxplot in Figure 8. It appears that for this variable the location of
the second group (L. carteri) is higher than that of the first group, but the second group has
one extremely low outlier with value 19 which affects the sample mean of this group.
Let us start with the MANOVA tests using all 7 variables. Because the sample size is small,
we use the 95% efficient MM-based test statistics, with FRB based on B = 5000 bootstrap
samples to estimate the p-values of the tests. We compare the results to the classical test and
the MCD-based test. Table 2 shows the p-values for each of the tests. From this table we
immediately see that all tests clearly reject the null hypothesis of equal centers, although the
robust tests downweight the effect of the outlier and the classical Wilks' Lambda does not.
Hence, the outlier in the data had no negative effect on the overall classical Wilks' Lambda
test in this case. This is a consequence of the large differences in sample means that are
present for some of the other variables in the dataset.

          Clas    MMΛa_n   MMΛb_n   MCD-test
p-value   .0000   .0034    .0002    .0000

Table 2: p-values for the classical and robust MANOVA tests applied on the biting flies data
to investigate a mean difference between the two species.

[Figure 7 about here: power versus d for n1 = n2 = n3 = 100; legends: Clas/S1/S2a/S2b,
Clas/Ma/Mb, Clas/Rank/MCD]

Figure 7: Power for the case n1 = n2 = n3 = 100 with 10% outliers in the last group. The top
panel considers the case p = 2 for Ha: µ3 = (d, 0). The bottom panel considers the case p = 6
for Ha: µ3 = (d, 0, . . . , 0).
[Figure 8 about here: horizontal boxplots of wing width (scale 20 to 50) for groups 1 and 2]

Figure 8: Biting flies data: boxplots of wing width measurements in both groups.
Let us now consider the simple setting of univariate one-way ANOVA for the wing width
measurements, which may be part of a post hoc analysis to find out in which variables the two
groups differ. Note that also in the univariate case, outlier-robust testing for equality of means
has received rather little attention in the literature and in statistical software. The tests
proposed in this paper are also useful in this setting. To illustrate the robustness advantage of
our tests, we again compare them to the classical test (which here simplifies to a two-sample
t-test). The t-test yields a p-value of 0.398. To illustrate the effect of the outlier, we repeat
the analysis without it. After removing the outlier (observation x2,1), the t-test finds a
significant difference in means, with p = 0.021. The corresponding empirical means can be
found in the left panel of Table 3.
               sample means and t-test      MM-centers and MMΛ_n test
               µ1       µ2       p          µ1       µ2       p
full data      42.91    43.74    .398       42.98    44.56    .021
x2,1 removed   42.91    44.47    .021       42.99    44.57    .023

Table 3: Location estimates for the two groups of biting flies and the corresponding p-values for
the classical t-test and the robust MM-based test, both before and after removal of the outlier.
Clearly, the outlier now critically influenced the classical test. Therefore, let us now consider
the robust MM-based tests. Note that for univariate data the test statistics (11) and (12)
coincide; hence we denote them by MMΛ_n in this case. Table 3 also contains the MM-estimates
of location. It can be seen that the outlier has very little effect on these estimates, as expected.
Figure 9 shows the histogram of the bootstrapped test statistics MMΛ*_n. The value of the test
statistic in the original sample was MMΛ_n = 0.939 and is indicated in Figure 9 by the vertical
line. In 106 out of the 5000 bootstrap samples we obtained MMΛ*_n < 0.939, which yields
p̂ = 0.021 according to (24). We thus find that the robust test behaves similarly to the classical
test after removal of the outlier. For the data without the outlier, the estimated p-value of the
MM-based test becomes p̂ = 0.023, which shows that the outlier also has very limited influence
on the robust test.
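The p-value computation in this last step is simply the fraction of bootstrap statistics below the observed one (small values of the Lambda-type statistic are evidence against H0); definition (24) of the manuscript is not reproduced here, so this one-sided fraction is a hedged reading of it:

```python
import numpy as np

def bootstrap_pvalue(stat_obs, boot_stats):
    """Estimated p-value: fraction of bootstrap statistics below stat_obs."""
    return float(np.mean(np.asarray(boot_stats) < stat_obs))
```

With 106 of the B = 5000 bootstrap values below MMΛ_n = 0.939, this gives 106/5000 ≈ 0.021, matching the reported p̂.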
[Figure 9 about here: histogram of MMΛ*_n values between 0.85 and 1, counts up to 2000, with
a vertical line at the observed statistic]

Figure 9: Biting flies data: histogram of bootstrapped MMΛ_n values; the vertical line indicates
the MM-based test statistic in the original sample.
Appendix
Proof of Proposition 1
Let us first consider one-sample S-estimators (µ̂^(1)_F, Σ̂^(1)_F), i.e. k = 1. Due to affine equiv-
ariance of the estimators, we only have to consider the case µ = 0 and Σ = I, i.e. F = F_{0,I}.
Clearly, taking b0 = E_{F_{0,I}}[ρ0(∥x∥)] implies that the choice (0, I) satisfies constraint (2) of
the manuscript and therefore is a candidate solution. Now suppose that there exists another
solution (m, S) ≠ (0, I) that satisfies constraint (2) in the manuscript with |S| ≤ |I| = 1. Let
us decompose S into a scale part s and a shape part G as before, i.e. S = s²G with |G| = 1 and
s ≤ 1. Then, from constraint (2) in the manuscript, we obtain

b0 = E_{F_{0,I}}( ρ0( [(x − m)^t S^{-1} (x − m)]^{1/2} ) )
   = E_{F_{0,I}}( ρ0( [(x − m)^t G^{-1} (x − m)]^{1/2} / s ) )
   ≥ E_{F_{0,I}}( ρ0( [(x − m)^t G^{-1} (x − m)]^{1/2} ) ),

where the last inequality holds because ρ0 is non-decreasing. Hence, to show that µ̂^(1)_F = 0 and
Σ̂^(1)_F = I, it suffices to show that for all (m, G) ≠ (0, I) with |G| = 1 we have that

E_{F_{0,I}}( ρ0( [(x − m)^t G^{-1} (x − m)]^{1/2} ) ) > b0.   (1)
We start by showing that for all z > 0

P( [(x − m)^t G^{-1} (x − m)]^{1/2} ≤ z ) ≤ P( ∥x∥ ≤ z ).   (2)

Let A1 = { [(x − m)^t G^{-1} (x − m)]^{1/2} ≤ z }, A2 = { ∥x∥ ≤ z }, and B = A1 ∩ A2. Then, the
left-hand side of (2) can be rewritten as

P( [(x − m)^t G^{-1} (x − m)]^{1/2} ≤ z )
   = ∫_{A1} dF_{0,I}(x) = ∫_{A1∩A2} f(x^t x) dx + ∫_{A1\A2} f(x^t x) dx
   ≤ ∫_{A1∩A2} f(x^t x) dx + f(z²) ∫_{A1\A2} dx
   = ∫_{A1∩A2} f(x^t x) dx + f(z²) ∫_{A2\A1} dx
   ≤ ∫_{A1∩A2} f(x^t x) dx + ∫_{A2\A1} f(x^t x) dx
   = ∫_{A2} dF_{0,I}(x) = P( ∥x∥ ≤ z ).

The inequality in the second step holds because f is non-increasing and ∥x∥ > z for all
x ∈ A1 \ A2. The equality in the third step holds because |G| = |I| = 1 implies that
vol(A1) = vol(A2) and thus also vol(A1 \ A2) = vol(A2 \ A1), where vol stands for volume.
Similarly as in the second step, the inequality in the fourth step follows by noting that ∥x∥ ≤ z
for all x ∈ A2 \ A1 and f is non-increasing.

Because f is assumed to be strictly decreasing at zero, a similar derivation shows that for
(m, G) ≠ (0, I) there exists ε > 0 such that for z ≤ ε we have

P( [(x − m)^t G^{-1} (x − m)]^{1/2} ≤ z ) < P( ∥x∥ ≤ z ).   (3)
Results (2)-(3), together with the conditions on ρ0, imply that ρ0( [(x − m)^t G^{-1} (x − m)]^{1/2} )
is stochastically larger than ρ0(∥x∥), which immediately yields (1) and hence Fisher-consistency
of the one-sample S-estimators.
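Inequality (2) can be checked by Monte Carlo for a concrete choice of (m, G) (the values below are arbitrary assumptions; the constraint |G| = 1 is respected):

```python
import numpy as np

rng = np.random.default_rng(4)
n, z = 200_000, 1.2
x = rng.standard_normal((n, 2))                 # x ~ F_{0,I}
m = np.array([1.0, 0.0])
G_inv = np.diag([0.5, 2.0])                     # G = diag(2, 0.5), det(G) = 1
# distances [(x - m)^t G^{-1} (x - m)]^{1/2}
dist = np.sqrt(np.einsum('ij,jk,ik->i', x - m, G_inv, x - m))
p_shifted = np.mean(dist <= z)                  # left-hand side of (2)
p_origin = np.mean(np.linalg.norm(x, axis=1) <= z)   # right-hand side of (2)
```

The shifted and reshaped region has the same volume as the ball of radius z but captures less mass of the unimodal density, which is exactly the volume argument in the proof.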
Now consider k-sample S-estimators. Since the S-estimators only use within-group dis-
tances, they are affine equivariant, and thus we only have to consider the case µj = 0 (j =
1, . . . , k) and Σ = I, i.e. Fj = F_{0,I} for j = 1, . . . , k. As before, taking b0 = E_{F_{0,I}}[ρ0(∥x∥)]
implies that the choice (0, . . . , 0, I) satisfies constraint (3) of the manuscript and therefore
is a candidate solution. As in the one-sample case, to show that there does not exist another
solution (m1, . . . , mk, S) ≠ (0, . . . , 0, I) with |S| ≤ |I| = 1 that satisfies constraint (3) of the
manuscript, we need to show that

Σ_{j=1}^k πj E_{F_{0,I}}( ρ0( [(x − mj)^t G^{-1} (x − mj)]^{1/2} ) ) > b0,   (4)

where S = s²G with |G| = 1 as before. The derivation for the one-sample case above shows
that for each of the expectations in (4) we have

E_{F_{0,I}}( ρ0( [(x − mj)^t G^{-1} (x − mj)]^{1/2} ) ) > b0   (5)

if (mj, S) ≠ (0, I), which immediately implies (4) and hence Fisher-consistency of the k-sample
S-estimators.
The k-sample MM-estimators are also based on within-group distances, so they are affine
equivariant, and thus we again only have to consider the case µj = 0 (j = 1, . . . , k) and Σ = I.
The k-sample MM-estimators (µ̃^(k)_{1,F}, . . . , µ̃^(k)_{k,F}, Γ̃^(k)_F) then minimize

Σ_{j=1}^k πj E_{F_{0,I}}( ρ1( [(x − mj)^t G^{-1} (x − mj)]^{1/2} / σ̂^(k)_F ) )

with |G| = 1. Here σ̂^(k)_F is the k-sample S-scale, which equals 1 due to Fisher-consistency of the
k-sample S-estimators. Hence, to show Fisher-consistency of the k-sample MM-estimators, we
need that

Σ_{j=1}^k πj E_{F_{0,I}}( ρ1( [(x − mj)^t G^{-1} (x − mj)]^{1/2} ) ) > E_{F_{0,I}}[ρ1(∥x∥)] = b1   (6)

for all (m1, . . . , mk, G) ≠ (0, . . . , 0, I) with |G| = 1. Similarly as for ρ0, results (2)-(3), together
with the conditions on ρ1, imply that for j = 1, . . . , k

E_{F_{0,I}}( ρ1( [(x − mj)^t G^{-1} (x − mj)]^{1/2} ) ) > b1   (7)

if (mj, G) ≠ (0, I), which immediately implies (6) and hence Fisher-consistency of the k-sample
MM-estimators.
Proof of Theorem 1 of the manuscript:
Because of equivariance we assume, without loss of generality, that Σ = Ip and µ = 0. First
note that the delta method implies that

n(1 − SΛ1_n) = (1/(2p)) n ( 1 − |Σ̂^(k)| / |Σ̂^(1)| ) + op(1).   (8)

In the following we drop the subscript n in the notation of the estimates, for convenience. To
simplify the derivation, we first rewrite the right-hand side of (8) as follows:

n( 1 − |Σ̂^(k)| / |Σ̂^(1)| ) = n( 1 − |(Σ̂^(1))^{-1}| / |(Σ̂^(k))^{-1}| )
   = ( 1 / |(Σ̂^(k))^{-1}| ) n ( |(Σ̂^(k))^{-1}| − |(Σ̂^(1))^{-1}| ).   (9)

Furthermore, by considering the Taylor expansion

|(Σ̂^(1))^{-1}| = 1 + tr( (Σ̂^(1))^{-1} − Ip ) + op(n^{-1}),

and similarly for |(Σ̂^(k))^{-1}|, we can rewrite the difference of determinants on the right-hand
side of (9) as

|(Σ̂^(k))^{-1}| − |(Σ̂^(1))^{-1}| = tr( (Σ̂^(k))^{-1} − (Σ̂^(1))^{-1} ) + op(n^{-1}).   (10)

This allows us to work with traces rather than determinants in the remainder.
We now expand the following two representations:

(1/n) Σ_{i=1}^n ρ0( [(xi − µ̂^(1))^t (Σ̂^(1))^{-1} (xi − µ̂^(1))]^{1/2} ) = b0   (11)

(1/n) Σ_{j=1}^k Σ_{i∈Πj} ρ0( [(xi − µ̂^(k)_j)^t (Σ̂^(k))^{-1} (xi − µ̂^(k)_j)]^{1/2} ) = b0   (12)

where µ̂^(1) denotes the one-sample S-estimate and µ̂^(k)_1, . . . , µ̂^(k)_k the k-sample S-estimates.
The second-order expansion of (11) is given by

b0 = (1/n) Σ_{i=1}^n ρ0(∥xi∥) − ( (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) xi^t ) (µ̂^(1) − µ)
   + tr[ ( (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/(2∥xi∥)) xi xi^t ) ( (Σ̂^(1))^{-1} − Ip ) ]
   + (1/2) (µ̂^(1) − µ)^t ( (1/n) Σ_{i=1}^n ( ψ0'(∥xi∥)/∥xi∥ − ψ0(∥xi∥)/∥xi∥² ) xi xi^t / ∥xi∥
      + (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) Ip ) (µ̂^(1) − µ)
   + An[µ, (Σ̂^(1))^{-1}] + Bn[(Σ̂^(1))^{-1}] + op(n^{-1}).

Here, An[µ, (Σ̂^(1))^{-1}] represents the mixed second-order term and Bn[(Σ̂^(1))^{-1}] represents the
term involving the second derivative with respect to (Σ̂^(1))^{-1}. The former can be shown to be
of order op(n^{-1}) and may therefore be dropped from the further derivation.
We have

(1/n) Σ_{i=1}^n ( ψ0'(∥xi∥)/∥xi∥ − ψ0(∥xi∥)/∥xi∥² ) xi xi^t / ∥xi∥ + (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) Ip →a.s. β0 Ip

and

(1/n) Σ_{i=1}^n (ψ0(∥xi∥)/(2∥xi∥)) xi xi^t →a.s. (γ0/(2p)) Ip.

Hence, we can write

(γ0/(2p)) n tr[ (Σ̂^(1))^{-1} − Ip ] = n( b0 − (1/n) Σ_{i=1}^n ρ0(∥xi∥) )
   + n ( (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) xi^t ) (µ̂^(1) − µ)
   − (1/2) β0 n (µ̂^(1) − µ)^t (µ̂^(1) − µ) + n Bn[(Σ̂^(1))^{-1}] + op(1).
Similarly we expand (12) and then take the difference, which yields

(γ0/(2p)) n tr[ (Σ̂^(k))^{-1} − (Σ̂^(1))^{-1} ]
   = n Σ_{j=1}^k πj ( (1/nj) Σ_{i∈Πj} (ψ0(∥xi∥)/∥xi∥) xi^t ) (µ̂^(k)_j − µ̂^(1))
   − (1/2) β0 n Σ_{j=1}^k πj (µ̂^(k)_j − µ)^t (µ̂^(k)_j − µ) + (1/2) β0 n (µ̂^(1) − µ)^t (µ̂^(1) − µ)
   + n Bn[(Σ̂^(k))^{-1}] − n Bn[(Σ̂^(1))^{-1}] + op(1).

It can be shown that the difference Bn[(Σ̂^(k))^{-1}] − Bn[(Σ̂^(1))^{-1}] is of order op(n^{-1}), so this
term can henceforth be omitted.
We now use the first-order approximations for the location estimates:

√n (µ̂^(1) − µ) = (1/β0) √n ( (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) xi ) + op(1),

√n (µ̂^(k)_j − µ) = (1/β0) √n ( (1/nj) Σ_{i∈Πj} (ψ0(∥xi∥)/∥xi∥) xi ) + op(1),   j = 1, . . . , k.

Direct use of these approximations yields

(γ0/(2p)) n tr[ (Σ̂^(k))^{-1} − (Σ̂^(1))^{-1} ]
   = (1/(2β0)) [ n Σ_{j=1}^k πj ( (1/nj) Σ_{i∈Πj} (ψ0(∥xi∥)/∥xi∥) xi )^t ( (1/nj) Σ_{i∈Πj} (ψ0(∥xi∥)/∥xi∥) xi )
   − n ( (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) xi )^t ( (1/n) Σ_{i=1}^n (ψ0(∥xi∥)/∥xi∥) xi ) ] + op(1).
Now, denote

Zj = (1/nj) Σ_{i∈Πj} (ψ0(∥xi∥)/∥xi∥) xi   (j = 1, . . . , k),

such that we can write

(β0 γ0 / p) n tr[ (Σ̂^(k))^{-1} − (Σ̂^(1))^{-1} ]
   = n Σ_{j=1}^k πj Zj^t Zj − n ( Σ_{j=1}^k πj Zj )^t ( Σ_{j=1}^k πj Zj ) + op(1).   (13)

Consider the following (Helmert-type) transformation of these variables:

Y1 = Σ_{j=1}^k πj Zj
Y2 = (π1 Z1 − π1 Z2) / ( π1 + π1²/π2 )^{1/2}
Y3 = (π1 Z1 + π2 Z2 − (π1 + π2) Z3) / ( π1 + π2 + (π1 + π2)²/π3 )^{1/2}
. . .
Yk = ( Σ_{j=1}^{k−1} πj Zj − ( Σ_{j=1}^{k−1} πj ) Zk ) / ( Σ_{j=1}^{k−1} πj + ( Σ_{j=1}^{k−1} πj )²/πk )^{1/2}

It can be shown that this is an orthogonal transformation, such that the Yj are independent
and

Σ_{j=1}^k πj Zj^t Zj = Σ_{j=1}^k Yj^t Yj.

Noting that the second term on the right-hand side of (13) equals n Y1^t Y1, we may rewrite
the equation as

(β0 γ0 / p) n tr[ (Σ̂^(k))^{-1} − (Σ̂^(1))^{-1} ] = n Σ_{j=2}^k Yj^t Yj + op(1)   (14)

(note that the sum now starts from j = 2 instead of j = 1). For n → ∞ we have that
Zj →D N( 0, (1/(πj n)) α0 Ip ),

independently for each j = 1, . . . , k, where α0 = (1/p) E_{F_{0,I}}[ψ0²(∥x∥)]. Hence,

Yj →D N( 0, (1/n) α0 Ip ),

independently for each j = 1, . . . , k. We therefore have in equation (14) a sum of squares of
p(k − 1) independent (asymptotically) normally distributed variables, leading to the result

n tr[ (Σ̂^(k))^{-1} − (Σ̂^(1))^{-1} ] = ( E_{F_{0,I}}[ψ0²(∥x∥)] / (γ0 β0) ) χ²_{p(k−1)} + op(1).

Then by (8), (10), and since the remaining factor in (9) converges almost surely to 1 under
the assumption Σ = Ip, we finally obtain

n(1 − SΛ1_n) = ( E_{F_{0,I}}[ψ0²(∥x∥)] / (2p γ0 β0) ) χ²_{p(k−1)} + op(1).
The proofs for the other test statistics proceed along the same lines.
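The Helmert-type identity Σj πj Zj^t Zj = Σj Yj^t Yj used in the proof of Theorem 1 can be verified numerically (the random Zj and group weights below are arbitrary; the identity relies on Σj πj = 1):

```python
import numpy as np

rng = np.random.default_rng(5)
k, p = 3, 2
pi = np.array([20.0, 30.0, 50.0])
pi /= pi.sum()                              # pi_j = n_j / n, sums to 1
Z = rng.standard_normal((k, p))             # arbitrary Z_j vectors

Y = [pi @ Z]                                # Y_1 = sum_j pi_j Z_j
for m in range(1, k):
    s = pi[:m].sum()
    # Y_{m+1} = (sum_{j<=m} pi_j Z_j - s Z_{m+1}) / sqrt(s + s^2 / pi_{m+1})
    Y.append((pi[:m] @ Z[:m] - s * Z[m]) / np.sqrt(s + s**2 / pi[m]))

lhs = sum(pi[j] * Z[j] @ Z[j] for j in range(k))
rhs = sum(y @ y for y in Y)                 # equals lhs up to rounding
```

Since the second term in (13) is n Y1^t Y1, dropping Y1 from the sum leaves exactly the quadratic form in (14).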
Proof of Theorem 2 of the manuscript:
The result requires taking second derivatives of the statistics, which is straightforward but
tedious. To keep the notation simple, we sketch the derivation here for the univariate case
only (p = 1). We write the test statistics explicitly as functionals, i.e. as functions of the
distribution F. In the univariate case, the functionals of the test statistics become

SΛ1(F) = σ^(k)(F) / σ^(1)(F)

SΛ2.(F) = Σ_{j=1}^k πj E_{Fj}[ ρ0( (x − µ^(k)_j(F)) / σ^(k)(F) ) ] / E_{F1}[ ρ0( (x − µ^(1)(F)) / σ^(k)(F) ) ]

MMΛ.(F) = Σ_{j=1}^k πj E_{Fj}[ ρ1( (x − µ̃^(k)_j(F)) / σ^(k)(F) ) ] / E_{F1}[ ρ1( (x − µ̃^(1)(F)) / σ^(k)(F) ) ]

We assume that k = 2 (the derivation is completely analogous in case k > 2) and recall that
the contamination is placed in group 1. For the first test statistic, note that

∂²/(∂ϵ)² SΛ1(F_{ϵ,y})|_{ϵ=0} = ∂²/(∂ϵ)² σ^(k)(F_{ϵ,y})|_{ϵ=0} − ∂²/(∂ϵ)² σ^(1)(F_{ϵ,y})|_{ϵ=0}   (15)

because σ^(k) = σ^(1) = 1 and the first-order influence functions (IF) of σ^(k) and σ^(1) coincide, i.e.

∂/∂ϵ σ^(k)(F_{ϵ,y})|_{ϵ=0} = ∂/∂ϵ σ^(1)(F_{ϵ,y})|_{ϵ=0}.
The scale functionals σ^(1) and σ^(k) are defined by (the univariate equivalent of) (2) and (3) in
the manuscript. Their second-order derivatives can be obtained by taking the second derivative
of these equations. For instance, for σ^(k) we find that

π1 ∂²/(∂ϵ)² ( (1 − ϵ) E[ ρ0( (x − µ^(k)_1(F_{ϵ,y})) / σ^(k)(F_{ϵ,y}) ) ] + ϵ ρ0( (y − µ^(k)_1(F_{ϵ,y})) / σ^(k)(F_{ϵ,y}) ) )
   + π2 ∂²/(∂ϵ)² E[ ρ0( (x − µ^(k)_2(F_{ϵ,y})) / σ^(k)(F_{ϵ,y}) ) ] = 0,

where the expectation is always over F_{0,1}. From this expression we obtain that

∂²/(∂ϵ)² σ^(k)(F_{ϵ,y})|_{ϵ=0} = ( −1 / E[ψ0(x)x] ) ( −π1 ( E[2ψ0(x)x + ρ0(x)] − (2ψ0(y)y − ρ0(y)) ) IF(y, σ^(k), F)
   + 2π1 ψ0(y) IF(y, µ^(k)_1, F) − π1 E[ψ0(x)] IF(y, µ^(k)_1, F)²
   − E[ψ0'(x)x² + ψ0(x)x] IF(y, σ^(k), F)² ).   (16)
A similar expression holds for σ^(1). From the first-order derivatives of equations (2) and (3) in
the manuscript we find that

IF(y, σ^(k), F) = IF(y, σ^(1), F) = π1 (ρ0(y) − b0) / E[ψ0(x)x]

IF(y, µ^(k)_1, F) = ψ0(y) / E[ψ0'(x)]

IF(y, µ^(1), F) = π1 ψ0(y) / E[ψ0'(x)]

The second-order IF for SΛ1 now follows through (15).

For SΛ2.(F), note that the numerator actually equals b0 according to (3) in the manuscript,
and write SΛ2.(F_{ϵ,y}) = b0/V(ϵ), where V(0) = b0, such that we have

∂²/(∂ϵ)² SΛ2.(F_{ϵ,y})|_{ϵ=0} = −(1/b0) ∂²/(∂ϵ)² V(ϵ)|_{ϵ=0}.   (17)

The second derivative of V(ϵ) at 0 is

∂²/(∂ϵ)² V(ϵ)|_{ϵ=0} = 2π1² ( E[ψ0(x)x] − ψ0(y)y ) (ρ0(y) − b0) / E[ψ0(x)x] − π1² ψ0²(y) / E[ψ0'(x)]
   + π1² E[ψ0'(x)x² + ψ0(x)x] (ρ0(y) − b0)² / E[ψ0(x)x]² − E[ψ0(x)x] IF2(y, σ^(k), F),

where IF2(y, σ^(k), F) is given by (16). The second-order IF for SΛ2. then follows by (17).

In MMΛ. the numerator is no longer constant, but the derivation of the second-order IF is
analogous to that of SΛ2..
Proof of Theorem 3 of the manuscript:
The test statistic Λ is asymptotically χ2q according to (13) in the manuscript and up to O(1/n)
we have that α(Fϵ,n) = 1−Hq(η1−α0 , δ(ϵ)) where δ(ϵ) = nΛ(Fϵ,n). Let b(ϵ) = −Hq(η1−α0 , δ(ϵ)),
then we have up to O(1/n) that
α(Fϵ,n)− α0 = b(ϵ)− b(0) = ϵ b′(0) +ϵ2
2b′′(0) + o(ϵ2).
A second order von Mises expansion of Λ(Fϵ,n) yields
Λ(Fϵ,n) = Λ(F ) +ϵ√n
∫ξ1(x) dG(x) +
1
2
ϵ2
n
∫∫ξ2(x,y) dG(x) dG(y) + o(ϵ2/n) (18)
20
Page 21
with $\xi_1(x) = IF(x,\Lambda,F) = 0$ (see e.g. Fernholz 2001, Gatto and Ronchetti 1996). From (18)
we immediately obtain $b'(0) = \kappa\,\frac{\partial\delta}{\partial\epsilon}\big|_{\epsilon=0} = n\kappa\,\frac{\partial\Lambda(F_{\epsilon,n})}{\partial\epsilon}\big|_{\epsilon=0} = 0$, and

$$b''(0) = \kappa\,\frac{\partial^2\delta}{\partial\epsilon^2}\Big|_{\epsilon=0} = n\kappa\,\frac{\partial^2\Lambda(F_{\epsilon,n})}{\partial\epsilon^2}\Big|_{\epsilon=0} = \kappa \iint \xi_2(x,y)\, dG(x)\, dG(y).$$

For $G = \Delta_y$ this expression reduces to $b''(0) = \kappa\,\xi_2(y,y) = \kappa\, IF_2(y,\Lambda,F)$.
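A small numerical illustration (not part of the proof) of what Theorem 3 predicts: because $b'(0)=0$, the level distortion is quadratic in $\epsilon$, so doubling the contamination fraction should roughly quadruple $\alpha(F_{\epsilon,n})-\alpha_0$. The degrees of freedom $q=3$, level $\alpha_0=0.05$, and the noncentrality rate $\delta(\epsilon)=c\,\epsilon^2$ with $c=5$ are purely hypothetical choices.

```python
from scipy.stats import chi2, ncx2

q, alpha0 = 3, 0.05                   # hypothetical degrees of freedom and nominal level
eta = chi2.ppf(1 - alpha0, q)         # critical value eta_{1-alpha_0}

def level(eps, c=5.0):
    # rejection probability 1 - H_q(eta, delta(eps)) with delta(eps) = c * eps^2,
    # the quadratic rate implied by a vanishing first-order IF (b'(0) = 0)
    return 1.0 - ncx2.cdf(eta, q, c * eps**2)

d1 = level(0.01) - alpha0
d2 = level(0.02) - alpha0
print(d2 / d1)   # close to 4: doubling eps about quadruples the level distortion
```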
Proof of Theorem 4 of the manuscript:
Analogously to $\hat\Theta^{R*}_n$, define

$$\hat\Theta^R_n := \Theta + [I - \nabla g_n(\Theta)]^{-1}\big(g_n(\Theta) - \Theta\big),$$

where $\Theta$ denotes the limiting value of $\hat\Theta_n$ as before. It can be shown, as in Salibian-Barrera
et al. (2006), that

$$\hat\Theta^R_n - \hat\Theta_n = o_p(n^{-1/2}). \qquad (19)$$
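To make the linear correction behind $\hat\Theta^R_n$ concrete, here is a toy one-dimensional sketch of the fast and robust bootstrap idea (our own illustration, not the MANOVA estimators of the manuscript): a location M-estimator written as a fixed point $\mu = g_n(\mu)$, where each bootstrap replicate is obtained by evaluating the resampled map once at $\hat\mu$ and applying the correction factor $[1-g_n'(\hat\mu)]^{-1}$, instead of fully re-solving. The Huber constant, sample size, and fixed unit scale are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def w(r, c=1.345):
    # Huber weight psi(r)/r (scale fixed at 1 for simplicity)
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def g(mu, x):
    # fixed-point map: weighted mean with weights evaluated at mu
    wi = w(x - mu)
    return np.sum(wi * x) / np.sum(wi)

def solve(x, mu0, tol=1e-10):
    # fully iterated estimator (the expensive route that FRB avoids)
    mu = mu0
    for _ in range(500):
        new = g(mu, x)
        if abs(new - mu) < tol:
            return new
        mu = new
    return mu

x = rng.normal(size=200)
mu_hat = solve(x, np.median(x))

# scalar analogue of [I - grad g_n]^{-1}, derivative by central difference at mu_hat
h = 1e-6
corr = 1.0 / (1.0 - (g(mu_hat + h, x) - g(mu_hat - h, x)) / (2 * h))

B = 500
frb = np.empty(B)
full = np.empty(B)
for b in range(B):
    xb = rng.choice(x, size=x.size, replace=True)
    frb[b] = mu_hat + corr * (g(mu_hat, xb) - mu_hat)  # one cheap linear correction
    full[b] = solve(xb, mu_hat)                        # full re-iteration, for comparison

print(np.corrcoef(frb, full)[0, 1])
```

On clean data the corrected replicates track the fully re-iterated ones almost perfectly, at a fraction of the cost; this is the computational point of the FRB.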
Let $h$ denote the limiting version of the statistic $h_n$. Condition (21) in the manuscript means
that $\nabla h(\Theta) = 0$ under the null hypothesis. Since $\hat\Theta_n$ is a root-$n$ consistent estimate of $\Theta$,
this implies that

$$\nabla h_n(\hat\Theta_n) = O_p(n^{-1/2}). \qquad (20)$$

Therefore, considering the expansion

$$n\big(h_n(\hat\Theta^R_n) - h_n(\Theta)\big) = n\big(h_n(\hat\Theta_n) - h_n(\Theta)\big) + \nabla h_n(\hat\Theta_n)\, n\big(\hat\Theta^R_n - \hat\Theta_n\big) + R_n,$$

we may conclude that

$$n\big(h_n(\hat\Theta^R_n) - h_n(\Theta)\big) = n\big(h_n(\hat\Theta_n) - h_n(\Theta)\big) + o_p(1). \qquad (21)$$

For the left-hand side of (21), we can write

$$n\big(h_n(\hat\Theta^R_n) - h_n(\Theta)\big) = n\,\nabla h_n(\Theta)\big(\hat\Theta^R_n - \Theta\big) + n\big(\hat\Theta^R_n - \Theta\big)^t H_n(\Theta)\big(\hat\Theta^R_n - \Theta\big) + o_p(1), \qquad (22)$$
where $H_n(\cdot)$ is the Hessian matrix corresponding to $h_n(\cdot)$. It follows from Salibian-Barrera et
al. (2006, Theorem 2) that

$$\hat\Theta^{R*}_n - \hat\Theta_n = \hat\Theta^R_n - \Theta + o_p(n^{-1/2}), \qquad (23)$$
where the left-hand side is considered conditionally on the sample. By (22) and (23), and since
$\nabla h_n(\Theta) = O_p(n^{-1/2})$, we can write

$$n\big(h_n(\hat\Theta^R_n) - h_n(\Theta)\big) = n\,\nabla h_n(\Theta)\big(\hat\Theta^{R*}_n - \hat\Theta_n\big) + n\big(\hat\Theta^{R*}_n - \hat\Theta_n\big)^t H_n(\Theta)\big(\hat\Theta^{R*}_n - \hat\Theta_n\big) + o_p(1). \qquad (24)$$

The asymptotic validity of the bootstrap (see e.g. Bickel and Freedman 1981) implies that we
similarly have

$$n\big(h^*_n(\hat\Theta^R_n) - h^*_n(\hat\Theta_n)\big) = n\,\nabla h^*_n(\Theta)\big(\hat\Theta^{R*}_n - \hat\Theta_n\big) + n\big(\hat\Theta^{R*}_n - \hat\Theta_n\big)^t H^*_n(\Theta)\big(\hat\Theta^{R*}_n - \hat\Theta_n\big) + o_p(1), \qquad (25)$$

where $\nabla h^*_n(\Theta)$ and $H^*_n(\Theta)$ converge to the same limits as $\nabla h_n(\Theta)$ and $H_n(\Theta)$. Hence,

$$n\big(h_n(\hat\Theta^R_n) - h_n(\Theta)\big) = n\big(h^*_n(\hat\Theta^{R*}_n) - h^*_n(\hat\Theta_n)\big) + o_p(1), \qquad (26)$$
and thus, by (21),

$$n\big(h_n(\hat\Theta_n) - h_n(\Theta)\big) = n\big(h^*_n(\hat\Theta^{R*}_n) - h^*_n(\hat\Theta_n)\big) + o_p(1). \qquad (27)$$

Since both

$$h_n(\Theta) = 1 + o_p(n^{-1}) \quad\text{and}\quad h^*_n(\hat\Theta_n) = 1 + o_p(n^{-1}), \qquad (28)$$

we obtain

$$n\big(h_n(\hat\Theta_n) - 1\big) = n\big(h^*_n(\hat\Theta^{R*}_n) - 1\big) + o_p(1), \qquad (29)$$

which was to be shown. Note that the equality on the left of (28) follows from condition (21)
in the manuscript and from the fact that, after transformation to null data, $h_n(\hat\Theta_n) = 1$.
The equality on the right of (28) is immediately true in the case of S-estimates, because of
the equations satisfied by the estimates after translation, as explained in Section 4 of the
manuscript. In the case of MM-estimates, however, we have for the translated data that
$\tilde\mu^{(k,0)}_{j,n} = \tilde\mu^{(1)}_n$, $j = 1,\ldots,k$, but in general $\tilde\mu^{(1,0)}_n \neq \tilde\mu^{(1)}_n$, and similarly for the shape estimates. In
this case the result in (28) follows since

$$\tilde\mu^{(1,0)}_n - \tilde\mu^{(k,0)}_{1,n} = o_p(n^{-1/2}) \qquad (30)$$

and also $\tilde\Gamma^{(1,0)}_n - \tilde\Gamma^{(k,0)}_n = o_p(n^{-1/2})$. For example, for the statistic $T_n(\cdot) = MM\Lambda^b_n$ we can then
write, by considering $h^*_n(\hat\Theta_n)$ as a function of $\tilde\mu^{(1,0)}_n$ only and expanding it around $\tilde\mu^{(k,0)}_{1,n}$,

$$h^*_n(\hat\Theta_n) = 1 + \frac{\partial h^*_n}{\partial \tilde\mu^{(1,0)}_n}\,\big(\tilde\mu^{(1,0)}_n - \tilde\mu^{(k,0)}_{1,n}\big) + o_p(n^{-1}),$$
and (28) follows since the derivative of $h^*_n$ is of order $O_p(n^{-1/2})$.
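Spelling out the order count in this last step, with the rates established above:

```latex
\frac{\partial h^*_n}{\partial \tilde\mu^{(1,0)}_n} = O_p(n^{-1/2})
\quad\text{and}\quad
\tilde\mu^{(1,0)}_n - \tilde\mu^{(k,0)}_{1,n} = o_p(n^{-1/2})
\;\Longrightarrow\;
h^*_n(\hat\Theta_n) - 1 = O_p(n^{-1/2})\, o_p(n^{-1/2}) = o_p(n^{-1}),
```

which is exactly the right-hand equality in (28).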
To show (30), one may proceed by noting that the estimates must satisfy the respective
first-order equations

$$\sum_{j=1}^{k}\,\sum_{i\in\Pi_j} \frac{\rho_1'\big(d_i(\tilde\mu^{(k,0)}_{j,n},\hat\Sigma^{(k,0)}_n)\big)}{d_i(\tilde\mu^{(k,0)}_{j,n},\hat\Sigma^{(k,0)}_n)}\,\big(x_i - \tilde\mu^{(k,0)}_{j,n}\big) = 0$$

and

$$\sum_{i=1}^{n} \frac{\rho_1'\big(d_i(\tilde\mu^{(1,0)}_n,\hat\Sigma^{(1,0)}_n)\big)}{d_i(\tilde\mu^{(1,0)}_n,\hat\Sigma^{(1,0)}_n)}\,\big(x_i - \tilde\mu^{(1,0)}_n\big) = 0,$$
where $d_i(\mu,\Sigma) = [(x_i-\mu)^t\,\Sigma^{-1}(x_i-\mu)]^{1/2}$. Expanding the first equation around $\hat\Sigma^{(1,0)}_n$ and
the second one around $\tilde\mu^{(k,0)}_{1,n}$ leads to

$$\tilde\mu^{(1,0)}_n - \tilde\mu^{(k,0)}_{1,n} = A_n\,\mathrm{vec}\big(\hat\Sigma^{(k,0)}_n - \hat\Sigma^{(1,0)}_n\big) + o_p(n^{-1/2}),$$

with $A_n = O_p(n^{-1/2})$, which suffices for (30) to hold since $\hat\Sigma^{(k,0)}_n = \hat\Sigma^{(1,0)}_n$.
In a similar but more elaborate way, we can show that $\tilde\Gamma^{(1,0)}_n - \tilde\Gamma^{(k,0)}_n = o_p(n^{-1/2})$, which is required
to establish the equality in (28) for the other MM-based test statistic.
References

Alqallaf, F., Van Aelst, S., Yohai, V.J. and Zamar, R.H. (2009), “Propagation of outliers in
multivariate data,” The Annals of Statistics, 37, 311–331.

Bickel, P.J. and Freedman, D.A. (1981), “Some asymptotic theory for the bootstrap,” The
Annals of Statistics, 9, 1196–1217.

Fernholz, L.T. (2001), “On multivariate higher order von Mises expansions,” Metrika, 53,
123–140.

Gatto, R. and Ronchetti, E. (1996), “General saddlepoint approximations of marginal densities
and tail probabilities,” Journal of the American Statistical Association, 91, 666–673.