Simple correspondence analysis (CA), Multiple - CARME 2011

Michael Greenacre, Universitat Pompeu Fabra, Barcelona

Simple correspondence analysis (CA), multiple correspondence analysis (MCA), joint correspondence analysis (JCA), as well as all subset versions of these, using the R package ca.

Oleg Nenadić & Michael Greenacre
University of Göttingen & Universitat Pompeu Fabra

[Photo: view of the Aegean Sea and the island of Lesbos. Turkey, August 2010.]

Assos: venue for CARME in ASSOS

ca package

• function ca: (simple) correspondence analysis (CA)

• function mjca: multiple correspondence analysis (MCA), adjusted MCA, joint correspondence analysis (JCA)

• subset versions: subset CA, subset MCA, adjusted subset MCA, subset JCA

Contribution coordinates

[Figure: two maps of the same data, with the column points labelled by fatty-acid names from 14:0 to 24:1(n-9). First panel: asymmetric map, map="rowprincipal", with all fatty acids labelled. Second panel: contribution coordinates, map="rowgreen", with only 16:1(n-7), 18:0, 18:4(n-3), 20:1(n-9), 20:5(n-3) and 22:1(n-11) labelled.]

See Biplots in Practice (Greenacre 2010), www.multivariatestatistics.org
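The idea behind contribution coordinates can be checked by hand. The following is a minimal sketch, assuming the column masses and first-dimension standard coordinates that ca(smoke) prints in the tutorial part of these slides: multiplying each standard coordinate by the square root of its mass gives a coordinate whose square is that column's contribution (ctr) to the axis.

```r
# Column masses and dimension-1 standard coordinates, copied from the
# ca(smoke) output shown in the tutorial part (none, light, medium, heavy).
mass   <- c(none = 0.316062, light = 0.233161, medium = 0.321244, heavy = 0.129534)
std_d1 <- c(none = -1.438471, light = 0.363746, medium = 0.718017, heavy = 1.074445)

# Contribution coordinate = standard coordinate times square root of the mass
# (the column scaling used by map = "rowgreen").
contrib_coord <- std_d1 * sqrt(mass)

# Squared contribution coordinates are the contributions to the axis;
# they sum to 1 and, times 1000, match the ctr column of summary(ca(smoke)).
round(1000 * contrib_coord^2)
sum(contrib_coord^2)
```

Points far from the origin in such a map are therefore exactly the ones that contribute most to the axis, which is why only a few labels are needed.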

Problem of variance explained

> summary(mjca(wg93[,1:4], lambda="indicator"))
Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.457379  11.4  11.4  *************************
 2      0.430966  10.8  22.2  ***********************
 3      0.321926   8.0  30.3  ***************
 :      :          :     :

> summary(mjca(wg93[,1:4], lambda="Burt"))
Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.209196  18.6  18.6  *************************
 2      0.185732  16.5  35.0  **********************
 3      0.103636   9.2  44.2  ***********
 :      :          :     :

> summary(mjca(wg93[,1:4], lambda="adjusted"))   # DEFAULT
Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.076455  44.9  44.9  *************************
 2      0.058220  34.2  79.1  *******************
 3      0.009197   5.4  84.5  ***
 :      :          :     :

> summary(mjca(wg93[,1:4], lambda="JCA"))
Percentage explained by JCA in 2 dimensions: 85.7%
(Eigenvalues are not nested)
[Iterations in JCA: 44 , epsilon = 9.91e-05]

Inertia explained in two dimensions increases from top to bottom: 22.2% (indicator), 35.0% (Burt), 79.1% (adjusted), 85.7% (JCA).
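One way to see why the indicator version explains so little: the total inertia of an indicator matrix depends only on the number of categories J and the number of variables Q, namely (J − Q)/Q. The following is a minimal sketch, assuming J = 20 and Q = 4 for the four five-point questions of wg93 and using the eigenvalues printed above; the variable names are illustrative only.

```r
# Percentages of inertia for the indicator version, recomputed by hand.
J <- 20   # total number of categories (4 questions x 5 response levels)
Q <- 4    # number of variables

# Total inertia of an indicator matrix: (J - Q) / Q
total_indicator <- (J - Q) / Q

# First three principal inertias from summary(mjca(..., lambda = "indicator"))
eig_indicator <- c(0.457379, 0.430966, 0.321926)

# Percentage of inertia per dimension
pct_indicator <- 100 * eig_indicator / total_indicator
round(pct_indicator, 1)
```

With a total inertia of 4 spread over 16 dimensions, no single dimension can carry a large percentage, however strong the associations are.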

Same problem for individual points

> summary(mjca(wg93[,1:4], lambda="Burt"))
Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.209196  18.6  18.6  *************************
 2      0.185732  16.5  35.0  **********************
 3      0.103636   9.2  44.2  ***********
 :      :          :     :

     name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr
 1 | A1   |   34  445   55 |  -840 391  53 |  -314  54   8 |
 2 | A2   |   92  169   38 |  -250 136  13 |   123  33   3 |
 3 | A3   |   59  344   47 |   204  47   5 |   517 298  36 |
 4 | A4   |   51  350   50 |   533 258  32 |  -318  92  12 |
 5 | A5   |   14  401   60 |   913 170  25 | -1064 231  36 |
 6 | B1   |   20  621   62 | -1338 519  80 |  -590 101  16 |
 7 | B2   |   50  158   47 |  -293  80   9 |   287  77  10 |
 8 | B3   |   59  227   45 |  -158  29   3 |   415 198  24 |
 9 | B4   |   81  210   41 |   327 185  19 |   121  25   3 |
10 | B5   |   40  722   60 |   619 229  34 |  -908 493  77 |
11 | C1   |   44  732   60 |  -987 632  93 |  -392 100  16 |
12 | C2   |   91  164   38 |  -113  27   3 |   255 137  14 |
13 | C3   |   57  296   48 |   283  84  10 |   450 212  27 |
14 | C4   |   44  345   52 |   617 289  37 |  -274  57   8 |
15 | C5   |   15  471   60 |   671  99  15 | -1300 372  59 |
16 | D1   |   17  251   56 |  -551  83  11 |  -785 168  25 |
17 | D2   |   67   14   42 |   101  14   1 |     3   0   0 |
18 | D3   |   58  303   48 |   176  33   4 |   499 269  34 |
19 | D4   |   65   25   43 |   101  14   1 |    91  11   1 |
20 | D5   |   43  272   50 |  -324  81  10 |  -496 191  25 |

Same problem for individual points

> summary(mjca(wg93[,1:4], lambda="Burt"))
Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.209196  18.6  18.6  *************************
 2      0.185732  16.5  35.0  **********************
 3      0.103636   9.2  44.2  ***********

> mjca(wg93[,1:4])$Burt
    A1  A2  A3  A4  A5  B1  B2  B3  B4  B5  C1  C2  C3  C4  C5  D1  D2  D3  D4  D5
A1 119   0   0   0   0  27  28  30  22  12  49  40  18   7   5  15  25  17  34  28
A2   0 322   0   0   0  38  74  84  96  30  67 142  60  41  12  22 102  76  68  54
A3   0   0 204   0   0   3  48  63  73  17  18  75  70  34   7  10  44  68  58  24
A4   0   0   0 178   0   3  21  23  79  52  16  50  40  56  16   9  52  28  54  35
A5   0   0   0   0  48   0   3   5  11  29   2   9   9  16  12   4   9  13  12  10
B1  27  38   3   3   0  71   0   0   0   0  43  19   4   3   2   9  17  10  10  25
B2  28  74  48  21   3   0 174   0   0   0  36  88  34  15   1  16  51  42  45  20
B3  30  84  63  23   5   0   0 205   0   0  37  90  57  19   2  10  53  63  51  28
B4  22  96  73  79  11   0   0   0 281   0  27  88  75  74  17   6  66  70  92  47
B5  12  30  17  52  29   0   0   0   0 140   9  31  27  43  30  19  45  17  28  31
C1  49  67  18  16   2  43  36  37  27   9 152   0   0   0   0  25  24  15  38  50
C2  40 142  75  50   9  19  88  90  88  31   0 316   0   0   0  15  97  67  89  48
C3  18  60  70  40   9   4  34  57  75  27   0   0 197   0   0   5  51  83  41  17
C4   7  41  34  56  16   3  15  19  74  43   0   0   0 154   0   6  44  30  51  23
C5   5  12   7  16  12   2   1   2  17  30   0   0   0   0  52   9  16   7   7  13
D1  15  22  10   9   4   9  16  10   6  19  25  15   5   6   9  60   0   0   0   0
D2  25 102  44  52   9  17  51  53  66  45  24  97  51  44  16   0 232   0   0   0
D3  17  76  68  28  13  10  42  63  70  17  15  67  83  30   7   0   0 202   0   0
D4  34  68  58  54  12  10  45  51  92  28  38  89  41  51   7   0   0   0 226   0
D5  28  54  24  35  10  25  20  28  47  31  50  48  17  23  13   0   0   0   0 151

Joint correspondence analysis

> mjca(wg93[,1:4], lambda="JCA")$Burt.upd
    A1  A2  A3  A4  A5  B1  B2  B3  B4  B5  C1  C2  C3  C4  C5  D1  D2  D3  D4  D5
A1  31  53  19  14   3  27  28  30  22  12  49  40  18   7   5  15  25  17  34  28
A2  53 131  77  52  10  38  74  84  96  30  67 142  60  41  12  22 102  76  68  54
A3  19  77  63  39   7   3  48  63  73  17  18  75  70  34   7  10  44  68  58  24
A4  14  52  39  54  20   3  21  23  79  52  16  50  40  56  16   9  52  28  54  35
A5   3  10   7  20   9   0   3   5  11  29   2   9   9  16  12   4   9  13  12  10
B1  27  38   3   3   0  21  20  18   8   3  43  19   4   3   2   9  17  10  10  25
B2  28  74  48  21   3  20  46  54  50   4  36  88  34  15   1  16  51  42  45  20
B3  30  84  63  23   5  18  54  65  64   4  37  90  57  19   2  10  53  63  51  28
B4  22  96  73  79  11   8  50  64 104  55  27  88  75  74  17   6  66  70  92  47
B5  12  30  17  52  29   3   4   4  55  74   9  31  27  43  30  19  45  17  28  31
C1  49  67  18  16   2  43  36  37  27   9  82  55   4   3   7  25  24  15  38  50
C2  40 142  75  50   9  19  88  90  88  31  55 126  79  46   9  15  97  67  89  48
C3  18  60  70  40   9   4  34  57  75  27   4  79  66  41   6   5  51  83  41  17
C4   7  41  34  56  16   3  15  19  74  43   3  46  41  45  18   6  44  30  51  23
C5   5  12   7  16  12   2   1   2  17  30   7   9   6  18  11   9  16   7   7  13
D1  15  22  10   9   4   9  16  10   6  19  25  15   5   6   9   9  15   5  13  18
D2  25 102  44  52   9  17  51  53  66  45  24  97  51  44  16  15  62  56  61  38
D3  17  76  68  28  13  10  42  63  70  17  15  67  83  30   7   5  56  64  56  21
D4  34  68  58  54  12  10  45  51  92  28  38  89  41  51   7  13  61  56  60  36
D5  28  54  24  35  10  25  20  28  47  31  50  48  17  23  13  18  38  21  36  38

• default: two-dimensional solution

• at convergence the diagonal blocks are perfectly fitted

Joint correspondence analysis

> summary(mjca(wg93[,1:4], lambda="JCA"))
Principal inertias (eigenvalues):
 dim    value
 1      0.099091
 2      0.065033
 :      :
        --------
 Total: 0.182425

Diagonal inertia discounted from eigenvalues: 0.0547405
Percentage explained by JCA in 2 dimensions: 85.7%
(Eigenvalues are not nested)
[Iterations in JCA: 44 , epsilon = 9.91e-05]

$$\frac{(0.099091 + 0.065033) - 0.0547405}{0.182425 - 0.0547405} = 0.857$$
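The percentage reported by JCA can be recomputed from the numbers in the summary. A minimal sketch of that arithmetic follows; the variable names are illustrative and not part of the ca package.

```r
# Quantities copied from summary(mjca(wg93[,1:4], lambda = "JCA"))
lambda_jca    <- c(0.099091, 0.065033)  # first two principal inertias
total_inertia <- 0.182425               # total inertia
diag_inertia  <- 0.0547405              # inertia due to the diagonal blocks

# The diagonal-block inertia is discounted from both the explained
# and the total inertia before taking the ratio.
pct_jca <- 100 * (sum(lambda_jca) - diag_inertia) /
                 (total_inertia - diag_inertia)
round(pct_jca, 1)
```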

Subset version of JCA available in new version: i.e., a subset of the categories is specified, and the analysis fits these optimally, using the original margins of the Burt matrix, omitting the (subsets of) categories in the diagonal blocks.

Adjusted MCA

[Figure: two maps of the categories A1 to D5. Burt version, with principal inertias λ_1, λ_2, …: 35% explained. Adjusted version, with principal inertias λ_1*, λ_2*, …: 79% explained.]

With λ_i the principal inertias of the Burt matrix and Q the number of variables (here Q = 4), the adjusted inertias are

$$\lambda_i^{*} = \left(\frac{Q}{Q-1}\right)^{2} \left(\sqrt{\lambda_i} - \frac{1}{Q}\right)^{2}$$
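The adjustment can be verified numerically. The following is a minimal sketch, assuming Q = 4 variables and taking the Burt principal inertias from the summary output shown earlier; it applies λ* = (Q/(Q−1))² (√λ − 1/Q)² and compares the result with the inertias printed for lambda="adjusted".

```r
Q <- 4                                          # number of variables in wg93[,1:4]
lambda_burt <- c(0.209196, 0.185732, 0.103636)  # Burt principal inertias (printed above)

# Adjusted inertias; the formula applies only where sqrt(lambda) > 1/Q
keep <- sqrt(lambda_burt) > 1 / Q
lambda_adj <- (Q / (Q - 1))^2 * (sqrt(lambda_burt[keep]) - 1 / Q)^2

round(lambda_adj, 6)
# Values printed by summary(mjca(wg93[,1:4])): 0.076455 0.058220 0.009197
```

Note that sqrt(lambda_burt) reproduces the indicator-matrix inertias 0.457379, 0.430966 and 0.321926, so the same adjustment can equally be written in terms of those.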

Adjusted MCA: nullifying the Burt matrix

The four diagonal blocks of the Burt matrix (A×A, B×B, C×C, D×D) are set to 0, giving the nullified Burt matrix B0:

B0 =
    A1  A2  A3  A4  A5  B1  B2  B3  B4  B5  C1  C2  C3  C4  C5  D1  D2  D3  D4  D5
A1   0   0   0   0   0  27  28  30  22  12  49  40  18   7   5  15  25  17  34  28
A2   0   0   0   0   0  38  74  84  96  30  67 142  60  41  12  22 102  76  68  54
A3   0   0   0   0   0   3  48  63  73  17  18  75  70  34   7  10  44  68  58  24
A4   0   0   0   0   0   3  21  23  79  52  16  50  40  56  16   9  52  28  54  35
A5   0   0   0   0   0   0   3   5  11  29   2   9   9  16  12   4   9  13  12  10
B1  27  38   3   3   0   0   0   0   0   0  43  19   4   3   2   9  17  10  10  25
B2  28  74  48  21   3   0   0   0   0   0  36  88  34  15   1  16  51  42  45  20
B3  30  84  63  23   5   0   0   0   0   0  37  90  57  19   2  10  53  63  51  28
B4  22  96  73  79  11   0   0   0   0   0  27  88  75  74  17   6  66  70  92  47
B5  12  30  17  52  29   0   0   0   0   0   9  31  27  43  30  19  45  17  28  31
C1  49  67  18  16   2  43  36  37  27   9   0   0   0   0   0  25  24  15  38  50
C2  40 142  75  50   9  19  88  90  88  31   0   0   0   0   0  15  97  67  89  48
C3  18  60  70  40   9   4  34  57  75  27   0   0   0   0   0   5  51  83  41  17
C4   7  41  34  56  16   3  15  19  74  43   0   0   0   0   0   6  44  30  51  23
C5   5  12   7  16  12   2   1   2  17  30   0   0   0   0   0   9  16   7   7  13
D1  15  22  10   9   4   9  16  10   6  19  25  15   5   6   9   0   0   0   0   0
D2  25 102  44  52   9  17  51  53  66  45  24  97  51  44  16   0   0   0   0   0
D3  17  76  68  28  13  10  42  63  70  17  15  67  83  30   7   0   0   0   0   0
D4  34  68  58  54  12  10  45  51  92  28  38  89  41  51   7   0   0   0   0   0
D5  28  54  24  35  10  25  20  28  47  31  50  48  17  23  13   0   0   0   0   0

• Perform eigendecomposition on B0 (suitably centred & normalized, as in MCA)

• The POSITIVE eigenvalues are exactly the adjusted inertias

• Adjustments for each category obtained in same way
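The link between the nullified Burt matrix and the adjustment formula can be explored in base R. This is only a sketch under stated assumptions: the data are simulated (not wg93), the helper names `indicator` and `standardize` are hypothetical, and the centring and normalization use the category masses of the full Burt matrix. Under this particular normalization the positive eigenvalues of the standardized B0 act as singular values, so their squares are what is compared with the adjusted inertias.

```r
set.seed(1)
n <- 300; Q <- 4
# Simulated categorical data: Q variables with 3 categories each (illustration only)
dat <- setNames(lapply(seq_len(Q), function(q) factor(sample(1:3, n, replace = TRUE))),
                LETTERS[seq_len(Q)])

# Indicator matrix Z and Burt matrix B = Z'Z
indicator <- function(f) sapply(levels(f), function(l) as.integer(f == l))
Z <- do.call(cbind, lapply(dat, indicator))
B <- crossprod(Z)

# Nullified Burt matrix: diagonal blocks set to zero
blk <- rep(seq_len(Q), times = sapply(dat, nlevels))
B0 <- B
B0[outer(blk, blk, "==")] <- 0

# Centre and normalize with the category masses, as in MCA
cm <- colSums(Z) / (n * Q)
standardize <- function(M) {
  P <- M / sum(M)
  (P - cm %o% cm) / sqrt(cm %o% cm)
}

# Eigenvalues of the standardized Burt matrix (the indicator-matrix inertias)
mu  <- eigen(standardize(B),  symmetric = TRUE)$values
# Eigenvalues of the standardized nullified Burt matrix
ev0 <- eigen(standardize(B0), symmetric = TRUE)$values

# Adjustment formula versus squared positive eigenvalues of B0
adj_formula   <- (Q / (Q - 1))^2 * (mu[mu - 1 / Q > 1e-8] - 1 / Q)^2
adj_nullified <- ev0[ev0 > 1e-8]^2
all.equal(adj_formula, adj_nullified)
```

The agreement follows because zeroing the diagonal blocks leaves the masses unchanged and shifts and rescales each eigenvalue μ of the standardized Burt matrix to (Q/(Q−1))(μ − 1/Q).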

Results for new version (default is "adjusted")

> summary(mjca(wg93[,1:4]))
Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.076455  44.9  44.9  *************************
 2      0.058220  34.2  79.1  *******************
 3      0.009197   5.4  84.5  ***
 :      :          :     :

     name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr
 1 | A1   |   34  963   55 |   508 860 115 |  -176 103  18 |
 2 | A2   |   92  659   38 |   151 546  28 |    69 113   7 |
 3 | A3   |   59  929   47 |  -124 143  12 |   289 786  84 |
 4 | A4   |   51  798   50 |  -322 612  69 |  -178 186  28 |
 5 | A5   |   14  799   60 |  -552 369  55 |  -596 430  84 |
 6 | B1   |   20  911   62 |   809 781 174 |  -331 131  38 |
 7 | B2   |   50  631   47 |   177 346  21 |   161 285  22 |
 8 | B3   |   59  806   45 |    96 117   7 |   233 690  55 |
 9 | B4   |   81  620   41 |  -197 555  41 |    68  65   6 |
10 | B5   |   40  810   60 |  -374 285  74 |  -509 526 179 |
11 | C1   |   44  847   60 |   597 746 203 |  -219 101  36 |
12 | C2   |   91  545   38 |    68 101   6 |   143 444  32 |
13 | C3   |   57  691   48 |  -171 218  22 |   252 473  62 |
14 | C4   |   44  788   52 |  -373 674  80 |  -153 114  18 |
15 | C5   |   15  852   60 |  -406 202  32 |  -728 650 136 |
16 | D1   |   17  782   56 |   333 285  25 |  -440 497  57 |
17 | D2   |   67  126   42 |   -61 126   3 |     2   0   0 |
18 | D3   |   58  688   48 |  -106  87   9 |   280 601  78 |
19 | D4   |   65  174   43 |   -61 103   3 |    51  71   3 |
20 | D5   |   43  869   50 |   196 288  22 |  -278 581  57 |

Subset version also available, using nullified Burt matrix as before

Packages with CA

• ca

• FactoMineR

• vegan

• ade4

• MASS

• caGUI

• biplotGUI

• …

Correspondence analysis with ca

Tutorial presented at CARME 2011 in Rennes, France, February 8, 2011

M. Greenacre, O. Nenadić


Introduction

In the practical part of this tutorial we demonstrate how to apply the ca package for simple, multiple and joint correspondence analysis in R.

R is a freely available statistical software environment. Since its introduction by R. Ihaka and R. Gentleman (1996) it has gained much popularity in the statistical community.

One advantage of R is its extension system, which allows R's capabilities to be extended by so-called packages.

Further information on R is available at the official R website: http://www.R-project.org .


The ca package, an overview

The ca package offers functions for the computation and visualization of correspondence analysis.

The core computations are done by the functions ca() (simple correspondence analysis) and mjca() (multiple and joint correspondence analysis).

Each function has its corresponding print, summary and plot method, which are used for presenting numerical results of the analysis and for the graphical display.

Additional functions include auxiliary functions that are usually not called directly by the user (e.g. iterate.mjca(), which is used in a joint correspondence analysis).


The ca package, an overview

The core functions in ca and their methods:

                       simple correspondence      multiple and joint
                       analysis                   correspondence analysis

- Computation:         ca()                       mjca()

- Numerical output:    print.ca()                 print.mjca()
                       summary.ca()               summary.mjca()

- Graphical display:   plot.ca()                  plot.mjca()
                       plot3d.ca()                (plot3d.mjca())

Where applicable, the functions for simple and for multiple / joint correspondence analysis share the same structure of arguments.


Simple correspondence analysis

Simple correspondence analysis is performed with the function ca():

> ca(smoke)

Principal inertias (eigenvalues):
                  1        2        3
Value      0.074759 0.010017 0.000414
Percentage 87.76%   11.76%   0.49%

Rows:
               SM        JM        SE        JE        SC
Mass     0.056995  0.093264  0.264249  0.455959  0.129534
ChiDist  0.216559  0.356921  0.380779  0.240025  0.216169
Inertia  0.002673  0.011881  0.038314  0.026269  0.006053
Dim. 1  -0.240539  0.947105 -1.391973  0.851989 -0.735456
Dim. 2  -1.935708 -2.430958 -0.106508  0.576944  0.788435

Columns:
             none     light    medium     heavy
Mass     0.316062  0.233161  0.321244  0.129534
ChiDist  0.394490  0.173996  0.198127  0.355109
Inertia  0.049186  0.007059  0.012610  0.016335
Dim. 1  -1.438471  0.363746  0.718017  1.074445
Dim. 2  -0.304659  1.409433  0.073528 -1.975960



Additional details are given with the summary method:

> summary(ca(smoke))

Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.074759  87.8  87.8  *************************
 2      0.010017  11.8  99.5  ***
 3      0.000414   0.5 100.0
        -------- -----
 Total: 0.085190 100.0

Rows:
     name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr
 1 | SM   |   57  893   31 |   -66  92   3 |  -194 800 214 |
 2 | JM   |   93  991  139 |   259 526  84 |  -243 465 551 |
 3 | SE   |  264 1000  450 |  -381 999 512 |   -11   1   3 |
 4 | JE   |  456 1000  308 |   233 942 331 |    58  58 152 |
 5 | SC   |  130  999   71 |  -201 865  70 |    79 133  81 |

Columns:
     name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr
 1 | none |  316 1000  577 |  -393 994 654 |   -30   6  29 |
 2 | lght |  233  984   83 |    99 327  31 |   141 657 463 |
 3 | medm |  321  983  148 |   196 982 166 |     7   1   2 |
 4 | hevy |  130  995  192 |   294 684 150 |  -198 310 506 |
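What ca() computes can be reproduced with a singular value decomposition in base R. The following is a minimal sketch, not the package's own code: the counts are typed in by hand on the assumption that they are the smoke table (they reproduce the masses printed above), and the names N, P, S and row_principal are illustrative.

```r
# Staff groups (rows) by smoking categories (columns); counts assumed to
# match the 'smoke' data set, since they reproduce the printed masses.
N <- matrix(c( 4,  2,  3,  2,
               4,  3,  7,  4,
              25, 10, 12,  4,
              18, 24, 33, 13,
              10,  6,  7,  2),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("SM", "JM", "SE", "JE", "SC"),
                            c("none", "light", "medium", "heavy")))

P  <- N / sum(N)        # correspondence matrix
rm <- rowSums(P)        # row masses
cm <- colSums(P)        # column masses

# Standardized residuals and their SVD
S   <- (P - rm %o% cm) / sqrt(rm %o% cm)
dec <- svd(S)

inertia <- dec$d^2                      # principal inertias (last one is ~0)
round(inertia[1:3], 6)
round(100 * inertia[1:3] / sum(inertia), 2)

# Row principal coordinates (signs of the axes are arbitrary)
row_principal <- (dec$u / sqrt(rm)) %*% diag(dec$d)
rownames(row_principal) <- rownames(N)
round(row_principal[, 1:2], 3)
```

The principal inertias and the row coordinates (up to sign) can then be compared with the Value row of ca(smoke) and the k=1, k=2 columns of summary(ca(smoke)), which are given in thousandths.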



Extensions to simple correspondence analysis include supplementary rows and/or columns as well as a subset analysis.

These extensions are handled by the optional arguments supcol / suprow and subsetcol / subsetrow:

# Considering the first column (non-smokers) as supplementary:
> ca(smoke, supcol = 1)

# Considering the subset of smokers (i.e. columns 2, 3 and 4):
> ca(smoke, subsetcol = 2:4)

# Adding a supplementary column to a subset analysis:
> ca(smoke, subsetcol = 2:4, supcol = 1)



The visualization of simple correspondence analysis is done with the corresponding plot method:

> plot(ca(smoke, supcol = 1))



As with the core function, additional options are provided by optional arguments. For example, different map scaling options are available with the option map:

option           description
"symmetric"      Rows and columns in principal coordinates (default)
"rowprincipal"   Rows in principal and columns in standard coordinates
"colprincipal"   Rows in standard and columns in principal coordinates
"symbiplot"      Row and column coordinates are scaled to have variances equal to the singular values
"rowgab"         Rows in principal coordinates and columns in standard coordinates times mass
"colgab"         Columns in principal coordinates and rows in standard coordinates times mass
                 (according to a proposal by Gabriel and Odoroff, 1990)
"rowgreen"       Rows in principal coordinates and columns in standard coordinates times the square root of the mass
"colgreen"       Columns in principal coordinates and rows in standard coordinates times the square root of the mass
                 (according to a proposal by Greenacre, 2006)



In addition, three-dimensional maps can be displayed using the rgl package (D. Murdoch, D. Adler):

> plot3d(ca(smoke))


Multiple and joint correspondence analysis

Multiple and joint correspondence analysis is computed with the function mjca().

The approach to MCA is determined by the option lambda:

lambda="indicator"   Multiple correspondence analysis based on the indicator matrix
lambda="Burt"        Multiple correspondence analysis based on the Burt matrix
lambda="adjusted"    Adjusted multiple correspondence analysis
lambda="JCA"         Joint correspondence analysis

By default, an adjusted MCA is performed, i.e. lambda="adjusted".



The input data for mjca() is a data frame comprising factors as the columns (response pattern matrix).

Internally, computations are performed on the Burt matrix (B), which is obtained from the indicator matrix (Z).
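The relation between the response pattern matrix, Z and B can be illustrated on a toy example. This is a minimal base-R sketch with made-up data and a hypothetical helper `indicator`; it is not the code used inside mjca().

```r
# Toy response pattern matrix: 5 respondents, 2 categorical variables
toy <- data.frame(
  A = factor(c("a1", "a2", "a1", "a2", "a1")),
  B = factor(c("b1", "b1", "b2", "b3", "b2"))
)

# One dummy (0/1) column per category of a factor
indicator <- function(f) sapply(levels(f), function(l) as.integer(f == l))

# Indicator matrix Z: dummy columns of all variables side by side
Z <- do.call(cbind, lapply(toy, indicator))

# Burt matrix: all two-way cross-tabulations, B = Z'Z
Burt <- crossprod(Z)
Burt
```

Each row of Z sums to the number of variables, the diagonal of the Burt matrix holds the category counts, and the off-diagonal blocks are the two-way contingency tables between pairs of variables.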



An example: a multiple correspondence analysis of the wg93 dataset (i.e. four questions on attitude towards science with responses on a five-point scale):

> mjca(wg93[,1:4])

Eigenvalues:
                  1       2        3       4        5     6
Value      0.076455 0.05822 0.009197 0.00567 0.001172 7e-06
Percentage 44.91%   34.2%   5.4%     3.33%   0.69%    0%

Columns:
               A1        A2        A3        A4        A5        B1        B2        B3
Mass     0.034156  0.092423  0.058553  0.051091  0.013777  0.020379  0.049943  0.058840
ChiDist  1.343394  0.676433  0.947274  1.049164  2.214898  1.856041  1.034203  0.933288
Inertia  0.061642  0.042289  0.052542  0.056238  0.067588  0.070203  0.053417  0.051252
Dim. 1   1.836627  0.546240 -0.446797 -1.165903 -1.995217  2.924321  0.641516  0.346050
Dim. 2  -0.727459  0.284443  1.199439 -0.736782 -2.470026 -1.370078  0.666938  0.963918

               B4        B5        C1        C2        C3        C4        C5        D1
Mass     0.080654  0.040184  0.043628  0.090700  0.056544  0.044202  0.014925  0.017222
ChiDist  0.760011  1.294006  1.241063  0.688137  0.977789  1.148345  2.132827  1.915937
Inertia  0.046587  0.067286  0.067197  0.042950  0.054060  0.058289  0.067895  0.063217
Dim. 1  -0.714126 -1.353725  2.157782  0.246828 -0.618996 -1.348858 -1.467582  1.203782
Dim. 2   0.280071 -2.107677 -0.908553  0.591611  1.044412 -0.634647 -3.016588 -1.821975
...



As in simple CA, a more detailed output is given with the summary method:

> summary(mjca(wg93[,1:4]))

Principal inertias (eigenvalues):
 dim    value      %   cum%   scree plot
 1      0.076455  44.9  44.9  *************************
 2      0.058220  34.2  79.1  *******************
 3      0.009197   5.4  84.5  ***
 4      0.005670   3.3  87.8  **
 5      0.001172   0.7  88.5
 6      7e-06000   0.0  88.5
        -------- -----
 Total: 0.170246

Columns:
     name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr
 1 | A1   |   34  963   55 |   508 860 115 |  -176 103  18 |
 2 | A2   |   92  659   38 |   151 546  28 |    69 113   7 |
 3 | A3   |   59  929   47 |  -124 143  12 |   289 786  84 |
 4 | A4   |   51  798   50 |  -322 612  69 |  -178 186  28 |
 5 | A5   |   14  799   60 |  -552 369  55 |  -596 430  84 |
 6 | B1   |   20  911   62 |   809 781 174 |  -331 131  38 |
...



The different approaches to MCA are specified with the optional argument lambda:

# MCA based on the indicator matrix:
> mjca(wg93[,1:4], lambda = "indicator")

# MCA based on the Burt matrix:
> mjca(wg93[,1:4], lambda = "Burt")

# MCA based on the adjusted approach:
> mjca(wg93[,1:4], lambda = "adjusted")
# lambda = "adjusted" is the default, hence the following
# gives the same result:
> mjca(wg93[,1:4])

# Joint correspondence analysis:
> mjca(wg93[,1:4], lambda = "JCA")



As with simple CA, supplementary variables are specified with the option supcol. In mjca() only supplementary variables (i.e. columns) are considered.

Columns 5 to 7 of the wg93 dataset contain additional demographic information (sex, age and education). These are included as supplementary variables as follows:

> mjca(wg93, supcol = 5:7)



The option subsetcol in mjca() refers to the column indexes of the subset categories (i.e. the levels of the variables).

For example, excluding the middle categories in the analysis of the wg93 dataset is done as follows:

> si <- (1:20)[-seq(3,18,5)]
> si
[1] 1 2 4 5 6 7 9 10 11 12 14 15 16 17 19 20
> mjca(wg93[,1:4], subsetcol = si)
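The same index vector can be derived from category names rather than from arithmetic on positions. This is a minimal sketch, assuming the categories are ordered variable by variable as A1, …, A5, B1, …, D5 (the order of the Burt matrix shown earlier); the vector `cats` is built by hand here and is not taken from an mjca object.

```r
# Category labels in the assumed column order of the Burt matrix
cats <- paste0(rep(LETTERS[1:4], each = 5), 1:5)

# Keep every category except the middle response (level 3) of each question
si_by_name <- which(!grepl("3$", cats))

# Same result as the positional construction used in the tutorial
si <- (1:20)[-seq(3, 18, 5)]
identical(si_by_name, si)
cats[si]
```

Selecting by label makes the intent explicit and is less error-prone when variables have different numbers of categories.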



Both options, subsetcol and supcol, can be combined, i.e. supplementary variables can be included in a subset analysis:

> mjca(wg93, subsetcol = si, supcol = 5:7)

Eigenvalues:
                  1        2        3        4       5
Value      0.070422 0.034998 0.007176 0.000875 0.00044
Percentage 53.96%   26.81%   5.5%     0.67%    0.34%

Columns:
               A1        A2        A4        A5        B1        B2        B4        B5
Mass     0.034156  0.092423  0.051091  0.013777  0.020379  0.049943  0.080654  0.040184
ChiDist  1.343394  0.676433  1.049164  2.214898  1.856041  1.034203  0.760011  1.294006
Inertia  0.061642  0.042289  0.056238  0.067588  0.070203  0.053417  0.046587  0.067286
Dim. 1   1.706316  0.544095 -1.307329 -2.435074  2.759360  0.850833 -0.569441 -1.710689
Dim. 2   1.275991 -0.343625  0.201719  2.794810  2.003836 -0.658112 -0.533170  2.100918
...

          sex1(*)   sex2(*)   age1(*)   age2(*)   age3(*)   age4(*)   age5(*)   age6(*)
Mass           NA        NA        NA        NA        NA        NA        NA        NA
ChiDist        NA        NA        NA        NA        NA        NA        NA        NA
Inertia        NA        NA        NA        NA        NA        NA        NA        NA
Dim. 1  -0.341876  0.328786 -0.405213 -0.243592 -0.033779 -0.030832  0.025808  0.666671
Dim. 2  -0.130770  0.125763 -0.319599  0.305108  0.075773 -0.016810 -0.190774 -0.146837
...



The plotting method gives the graphical representation of the result as a map:

> plot(mjca(wg93[,1:4]))


Summary

The computation is done with two functions, ca() for simple CA and mjca() for multiple and joint CA.

The input data is a table of frequencies for simple CA and a response pattern matrix (i.e. a data frame with factors) for multiple and joint CA.

In mjca() the type of analysis is controlled by the option lambda.

Subsets and supplementary variables are specified with subsetcol and supcol (in simple CA also subsetrow and suprow).

Output (numerical and graphical) is managed by the corresponding methods (print, summary and plot).

All available options are listed in the manual / help files.


The End

The package is available from the CARME-N website (Correspondence Analysis and Related Methods Network): http://www.carme-n.org

The package is currently at version 0.50. This version includes a major revision of the mjca part, in which all computations have been rewritten to follow a unified approach.

The next update will focus on the graphical output.

Feedback and suggestions are highly welcome:

[email protected] ; [email protected]
