THE INFORMATION IN CONTINGENCY TABLES -
AN APPLICATION OF INFORMATION-THEORETIC CONCEPTS
TO THE ANALYSIS OF CONTINGENCY TABLES

by

C. T. Ireland and S. Kullback

Technical Report No. 235

August 4, 1976

Prepared under Contract N00014-76-C-0475 (NR-042-267)

Office of Naval Research

Herbert Solomon, Project Director

Reproduction in Whole or in Part is Permitted for any Purpose of the United States Government

Approved for public release; distribution unlimited

DEPARTMENT OF STATISTICS
STANFORD UNIVERSITY
STANFORD, CALIFORNIA

The Information in Contingency Tables -

An Application of Information-Theoretic Concepts

to the Analysis of Contingency Tables

by

C. T. Ireland(1) and S. Kullback(2)

1. Introduction

The primary purpose of this paper is to present an exposition of the

methodology underlying the analysis of the information in contingency

tables. We shall stress the concepts, techniques, analyses and inferences

without entering into extensive technical statistical proofs or detailed

references to the bibliography at the end.

It is useful to note that we are concerned with an aspect of multivariate (multiple variates) analysis with particular application to qualitative or categorical as well as quantitative variables. The basic data we deal with are counts in multiway cross-classifications or multiway contingency tables. Multiway contingency tables, or cross-classifications of vectors of discrete random variables, provide a useful approach to the analysis of multivariate discrete data.

As we shall see, the analytic procedures serve to bring out various interrelationships among the classificatory variables in a multiway cross-classification or contingency table in many dimensions. Classical problems in the historical development of the analysis of contingency tables concerned themselves with such questions as the independence or conditional independence of the classificatory variables, or homogeneity or conditional homogeneity of the classificatory variables over time or space, for example. Such classical problems turn out to be special cases of the techniques we shall discuss. These techniques result in analyses which are essentially regression type analyses. As such they enable us to determine the relationship of one or more "dependent" qualitative or categorical variables of interest to a set of "independent" classificatory variables as well as the relative effects of changes in the "independent" variables on the "dependent" variables. In particular such problems as the determination of possible factors and measures of their effect and interactions in the representation of the logits of one or more dichotomous variables lend themselves to the analysis we shall examine.

The methodology is based on the Principle of Minimum Discrimination Information Estimation, associated statistics and Analyses of Information. General computer programs are available to provide the data for the inferences.

(1) C. T. Ireland is Professor of Statistics, George Washington University.

(2) S. Kullback is Visiting Professor of Statistics.


2. Contingency Tables

We assume that the reader has some familiarity with cross-classifications in the form of contingency tables. We use a slightly modified conventional notation. For example, for a four-way contingency table, that is, one with four classifications or variables, each of several categories, not necessarily the same in number, we represent the observed number of occurrences in the (ijkl) cell of the contingency table by x(ijkl), where the indices i, j, k, l range over the respective categories of the variables. The corresponding probabilities are represented by p(ijkl). Summation over one or more indices, resulting in various marginal distributions or marginals, is indicated by a dot or dots, thus

$$\sum_i x(ijkl) = x(.jkl), \qquad \sum_j \sum_l x(ijkl) = x(i.k.), \quad \text{etc.},$$

with a similar notation for the probabilities.
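The dot notation can be illustrated with a short sketch. The helper `marginal` and the table values below are hypothetical and serve only to show how summing over an index produces x(.jkl) or x(i.k.); they are not taken from the report.

```python
from itertools import product

def marginal(table, keep):
    """Sum a cell-count dict over every index position not listed in `keep`.

    `table` maps index tuples such as (i, j, k, l) to counts x(ijkl).
    `keep` lists the positions to retain, e.g. (1, 2, 3) for x(.jkl).
    """
    out = {}
    for cell, count in table.items():
        key = tuple(cell[pos] for pos in keep)
        out[key] = out.get(key, 0) + count
    return out

# Hypothetical 2x2x2x2 table of counts, filled with arbitrary illustrative values.
cells = list(product((1, 2), repeat=4))
x = {cell: n + 1 for n, cell in enumerate(cells)}

x_dot_jkl = marginal(x, keep=(1, 2, 3))    # x(.jkl): summed over i
x_i_dot_k_dot = marginal(x, keep=(0, 2))   # x(i.k.): summed over j and l
n = sum(x.values())                        # x(....): the grand total
```

Every marginal of a table sums to the same total n, which is a convenient consistency check.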

We shall denote estimates under various hypotheses or models by $x_u^*(ijkl)$, where values of the subscript u will range over the hypotheses or models.

An example of a 2x2 two-way contingency table is shown in Table 2.1.

Table 2.1

x(ij)    j = 1    j = 2
i = 1    x(11)    x(12)    x(1.)
i = 2    x(21)    x(22)    x(2.)
         x(.1)    x(.2)    x(..) = n

The estimated two-way table under the hypothesis or model of independence is shown in Table 2.2.

Table 2.2

x*(ij)   j = 1           j = 2
i = 1    x(1.)x(.1)/n    x(1.)x(.2)/n    x(1.)
i = 2    x(2.)x(.1)/n    x(2.)x(.2)/n    x(2.)
         x(.1)           x(.2)           n
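As a small sketch of Table 2.2, the estimate under independence can be computed directly from the row and column totals. The function name `independence_estimate` and the counts are hypothetical illustrations.

```python
def independence_estimate(x):
    """Return x*(ij) = x(i.) x(.j) / n for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in x]
    col_totals = [sum(col) for col in zip(*x)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

x = [[10, 20], [30, 40]]          # arbitrary illustrative counts
x_star = independence_estimate(x)
```

By construction the estimated table has the same one-way marginals as the observed table.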

A common statistical measure of the association or interaction between the variables of a two-way 2x2 contingency table is the cross-product ratio, or its logarithm. The cross-product ratio is defined by

$$\frac{x(11)\,x(22)}{x(12)\,x(21)} \tag{2.1}$$

though we shall be more concerned with its logarithm

$$\ln \frac{x(11)\,x(22)}{x(12)\,x(21)}. \tag{2.2}$$

We shall use natural logarithms, that is, logarithms to the base e rather than common logarithms to the base 10, because of the nature of the underlying mathematical statistical theory. Note that with the estimate for independence, or no association, the logarithm of the cross-product ratio is zero:

$$\ln \frac{x^*(11)\,x^*(22)}{x^*(12)\,x^*(21)} = \ln \frac{\dfrac{x(1.)x(.1)}{n}\cdot\dfrac{x(2.)x(.2)}{n}}{\dfrac{x(1.)x(.2)}{n}\cdot\dfrac{x(2.)x(.1)}{n}} = \ln 1 = 0. \tag{2.3}$$

The logarithm of the cross-product ratio is positive if the odds satisfy the inequalities

$$\frac{x(11)}{x(21)} > \frac{x(12)}{x(22)} \quad \text{or} \quad \frac{x(11)}{x(12)} > \frac{x(21)}{x(22)}$$

since then we get for the log-odds

$$\ln \frac{x(11)x(22)}{x(12)x(21)} = \ln \frac{x(11)}{x(21)} - \ln \frac{x(12)}{x(22)} > 0, \qquad \ln \frac{x(11)x(22)}{x(12)x(21)} = \ln \frac{x(11)}{x(12)} - \ln \frac{x(21)}{x(22)} > 0.$$

The logarithm of the cross-product ratio is negative if the odds satisfy the inequalities

$$\frac{x(11)}{x(21)} < \frac{x(12)}{x(22)} \quad \text{or} \quad \frac{x(11)}{x(12)} < \frac{x(21)}{x(22)}$$

since then we get for the log-odds

$$\ln \frac{x(11)x(22)}{x(12)x(21)} = \ln \frac{x(11)}{x(21)} - \ln \frac{x(12)}{x(22)} < 0, \qquad \ln \frac{x(11)x(22)}{x(12)x(21)} = \ln \frac{x(11)}{x(12)} - \ln \frac{x(21)}{x(22)} < 0.$$
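The sign behaviour of the log cross-product ratio, and its vanishing for the independence estimate, can be checked numerically. This is a minimal sketch; the function name and both tables are hypothetical illustrations, with the fitted table equal to x(i.)x(.j)/n for the observed one.

```python
import math

def log_cross_product_ratio(x):
    """ln[x(11) x(22) / (x(12) x(21))] for a 2x2 table with positive entries."""
    (x11, x12), (x21, x22) = x
    return math.log(x11) + math.log(x22) - math.log(x12) - math.log(x21)

observed = [[10, 20], [30, 40]]          # arbitrary illustrative counts
fitted = [[12.0, 18.0], [28.0, 42.0]]    # x(i.)x(.j)/n for the table above

lcpr_observed = log_cross_product_ratio(observed)  # negative, since 10/30 < 20/40
lcpr_fitted = log_cross_product_ratio(fitted)      # zero up to rounding
```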

For the three-way 2x2x2 contingency table, in addition to the classic types of independence, interaction or association, there arises an additional one, important historically and practically. This is known as no three-factor or no second-order interaction. No three-factor or no second-order interaction implies that the logarithm of the association measured by the cross-product ratio for any two of the variables is the same for all the values of the third variable, that is, there is no second-order interaction if

$$\begin{aligned}
\ln \frac{x(111)x(221)}{x(121)x(211)} &= \ln \frac{x(112)x(222)}{x(122)x(212)}, \\
\ln \frac{x(111)x(212)}{x(112)x(211)} &= \ln \frac{x(121)x(222)}{x(122)x(221)}, \\
\ln \frac{x(111)x(122)}{x(112)x(121)} &= \ln \frac{x(211)x(222)}{x(212)x(221)}.
\end{aligned} \tag{2.4}$$

One is concerned with the possible hypothesis or model of no second-order interaction when none of the other types of independence are found. [...] the corresponding estimate cannot be expressed explicitly in terms of observed marginals. The estimate is constrained to have the same two-way marginals as those observed. Iterative procedures exist to determine the estimate under the hypothesis or model of no second-order interaction. For the general three-way r x s x t contingency table there are [...] relations among the log cross-product ratios like (2.4) which must be satisfied [...].

For four-way and higher order contingency tables the problem of presentation of the data increases, as do the variety and number of questions about relationships of possible interest and varieties of interaction. The basic ideas, concepts, notation and terminology we have mentioned for the two- and three-way contingency tables extend to the more general cases as we consider the methodology. For some additional prefatory remarks see Ku et al (1971).

3. Discrimination Information

To make the discussion more specific and with no essential restric-

tion on the generality, we shall present it in terms of the analysis of

four-way contingency tables. Let us consider the collection of four-way

contingency tables RxSxTxU of dimension rxsxtxu. For convenience let us denote the aggregate of all cell identifications by $\Omega$, with individual cells identified by $\omega$, so that the generic variable is $\omega = (i,j,k,l)$, $i = 1,\ldots,r$, $j = 1,\ldots,s$, $k = 1,\ldots,t$, $l = 1,\ldots,u$. Suppose there are two probability distributions or contingency tables (we shall use these terms interchangeably) defined over the space $\Omega$, say $p(\omega)$, $\pi(\omega)$, with $\sum_\Omega p(\omega) = 1$, $\sum_\Omega \pi(\omega) = 1$. The discrimination information is defined by

$$I(p:\pi) = \sum_\Omega p(\omega) \ln \frac{p(\omega)}{\pi(\omega)}. \tag{3.1}$$

The basis for this definition, its properties, and relation to other definitions of information measures will not be considered in detail in this exposition. For the particular types of application to which we shall restrict this exposition the $\pi$-distribution, $\pi(\omega)$, in the definition (3.1) according to the problem of interest may either be specified, or it may be an estimated distribution. The $p$-distribution, $p(\omega)$, in the definition (3.1) ranges over or is a member of a family of distributions of interest.

Of the various properties of $I(p:\pi)$ we mention in particular the fact that $I(p:\pi) \ge 0$ and $= 0$ if and only if $p(\omega) = \pi(\omega)$.
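Definition (3.1) translates directly into a few lines of code. The sketch below uses a hypothetical function name and arbitrary illustrative distributions on a 2x2 table, and adopts the usual convention that cells with p = 0 contribute nothing to the sum.

```python
import math

def discrimination_information(p, pi):
    """I(p:pi) = sum over cells of p(w) ln(p(w)/pi(w)), with 0 ln 0 taken as 0.

    `p` and `pi` are dicts over the same cells; pi(w) must be positive wherever p(w) is.
    """
    return sum(pw * math.log(pw / pi[w]) for w, pw in p.items() if pw > 0)

cells = [(i, j) for i in (1, 2) for j in (1, 2)]
p = dict(zip(cells, (0.1, 0.2, 0.3, 0.4)))   # arbitrary illustrative distribution
uniform = {w: 0.25 for w in cells}

info = discrimination_information(p, uniform)   # positive, since p differs from uniform
zero = discrimination_information(p, p)         # zero, since the two arguments agree
```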

4. Minimum Discrimination Information Estimation

Many problems in the analysis of contingency tables may be characterized as estimating a distribution or contingency table subject to certain restraints and then comparing the estimated table with an observed table to determine whether the observed table satisfies a null hypothesis or model implied by the restraints. In accordance with the principle of minimum discrimination information estimation we determine that member of the collection or family of $p$-distributions satisfying the restraints which minimizes the discrimination information $I(p:\pi)$ over all members of the family of pertinent $p$-distributions. We denote the minimum discrimination information estimate by $p^*(\omega)$ so that

$$I(p^*:\pi) = \sum p^*(\omega) \ln \frac{p^*(\omega)}{\pi(\omega)} = \min I(p:\pi). \tag{4.1}$$

Unless otherwise stated, the summation is over $\Omega$, which will be omitted.

In a wide class of problems which can be characterized as "smoothing" or fitting an observed contingency table the restraints specify that the estimated distribution or contingency table have some set of marginals which are the same as those of an observed contingency table. In such cases $\pi(\omega)$ is taken to be either the uniform distribution $\pi(ijkl) = 1/rstu$ or a distribution already estimated subject to restraints contained in and implied by the restraints under examination. The latter case includes the classical hypotheses of independence, conditional independence, homogeneity, conditional homogeneity and interaction, all of which can be considered as instances of generalized independence and will be considered in some detail in this paper. By generalized independence is meant the fact that the estimates may be expressed as a product of factors which are functions of appropriate marginals. See Ku et al (1971).

5. Minimum Discrimination Information Statistic

To test whether an observed contingency table is consistent with the null hypothesis or model as represented by the minimum discrimination information estimate [...] other kinds of applications of the general theory.

6. Minimum Discrimination Information Theorem

We present a theorem which is the basis for the principle of minimum discrimination information estimation and its applications. [...] related to the context of this discussion on the analysis of contingency tables.

Let us consider the space $\Omega$ mentioned in Section 3 and the discrimination information introduced in (3.1). Suppose now, for example, that there are three linearly independent statistics of interest $T_1(\omega)$, $T_2(\omega)$, $T_3(\omega)$ defined over $\Omega$ [...] the discrimination information

$$I(p:\pi) = \sum p(\omega) \ln \frac{p(\omega)}{\pi(\omega)} \tag{6.2}$$

[...] the family of $p$-distributions which satisfy the restraints

$$\sum T_1(\omega)\,p(\omega) = \theta_1, \qquad \sum T_2(\omega)\,p(\omega) = \theta_2, \qquad \sum T_3(\omega)\,p(\omega) = \theta_3, \tag{6.3}$$

where $\theta_1$, $\theta_2$, $\theta_3$ are specified values, and $\pi(\omega)$ is a fixed distribution.

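If $\pi(\omega)$ satisfies the restraints (6.3), then of course the minimum value of $I(p:\pi)$ is zero and the minimizing distribution is $p^*(\omega) = \pi(\omega)$. More generally, the minimum discrimination information theorem states that the minimizing distribution is given by

$$p^*(\omega) = \frac{\exp\bigl(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\bigr)\,\pi(\omega)}{M(\tau_1,\tau_2,\tau_3)} \tag{6.4}$$

where

$$M(\tau_1,\tau_2,\tau_3) = \sum \exp\bigl(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\bigr)\,\pi(\omega) \tag{6.5}$$

is a normalizing factor so that $\sum p^*(\omega) = 1$, and the $\tau$'s are parameters which technically are in essence undetermined Lagrange multipliers whose values are defined in terms of $\theta_1$, $\theta_2$, $\theta_3$ by

$$\begin{aligned}
\theta_1 &= \frac{\partial}{\partial \tau_1} \ln M(\tau_1,\tau_2,\tau_3) = \frac{\sum \exp\bigl(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\bigr)\,T_1(\omega)\,\pi(\omega)}{M(\tau_1,\tau_2,\tau_3)} = \sum T_1(\omega)\,p^*(\omega), \\
\theta_2 &= \frac{\partial}{\partial \tau_2} \ln M(\tau_1,\tau_2,\tau_3) = \frac{\sum \exp\bigl(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\bigr)\,T_2(\omega)\,\pi(\omega)}{M(\tau_1,\tau_2,\tau_3)} = \sum T_2(\omega)\,p^*(\omega), \\
\theta_3 &= \frac{\partial}{\partial \tau_3} \ln M(\tau_1,\tau_2,\tau_3) = \frac{\sum \exp\bigl(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\bigr)\,T_3(\omega)\,\pi(\omega)}{M(\tau_1,\tau_2,\tau_3)} = \sum T_3(\omega)\,p^*(\omega).
\end{aligned} \tag{6.6}$$

We can now state a number of consequences of the preceding.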
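The form (6.4) to (6.6) can be explored numerically. The sketch below builds the distribution for given parameter values on a 2x2 table and compares the first relation in (6.6) with a finite-difference derivative of ln M. The function `mdi_family`, the three indicator statistics and the parameter values are hypothetical choices made for illustration; the sketch does not solve for the parameters from given restraint values.

```python
import math

def mdi_family(pi, T_funcs, taus):
    """Return (p_star, M) with p*(w) = exp(sum_k tau_k T_k(w)) pi(w) / M, as in (6.4)-(6.5)."""
    weights = {
        w: math.exp(sum(t * T(w) for t, T in zip(taus, T_funcs))) * pw
        for w, pw in pi.items()
    }
    M = sum(weights.values())
    return {w: v / M for w, v in weights.items()}, M

cells = [(i, j) for i in (1, 2) for j in (1, 2)]
pi = {w: 0.25 for w in cells}
T_funcs = [
    lambda w: 1.0 if w[0] == 1 else 0.0,     # indicator of i = 1
    lambda w: 1.0 if w[1] == 1 else 0.0,     # indicator of j = 1
    lambda w: 1.0 if w == (1, 1) else 0.0,   # indicator of the cell (1, 1)
]
taus = [0.3, -0.2, 0.5]   # arbitrary illustrative parameter values

p_star, M = mdi_family(pi, T_funcs, taus)
theta = [sum(T(w) * p_star[w] for w in cells) for T in T_funcs]

# Central finite difference of ln M with respect to tau_1, to compare with theta[0].
h = 1e-6
_, M_plus = mdi_family(pi, T_funcs, [taus[0] + h, taus[1], taus[2]])
_, M_minus = mdi_family(pi, T_funcs, [taus[0] - h, taus[1], taus[2]])
dlnM_dtau1 = (math.log(M_plus) - math.log(M_minus)) / (2 * h)
```

The numerical derivative agrees with the expected value of the first statistic under p*, which is the content of the first line of (6.6).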

We note first that $p^*(\omega)$ is a member of an exponential family of distributions generated by $\pi(\omega)$ and as such has the desirable statistical properties of members of an exponential family, which includes all the common and classic distributions. We may also write (6.4) as

$$\ln \frac{p^*(\omega)}{\pi(\omega)} = -\ln M(\tau_1,\tau_2,\tau_3) + \tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega) = L + \tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega) \tag{6.7}$$

with $L = -\ln M(\tau_1,\tau_2,\tau_3)$. The regression or log-linear expression in (6.7) for $\ln(p^*(\omega)/\pi(\omega))$ with $T_1(\omega)$, $T_2(\omega)$, $T_3(\omega)$ as the explanatory variables and $\tau_1$, $\tau_2$, $\tau_3$ as the regression coefficients plays an important role in the analysis we shall consider.

We note next that the minimum value of the discrimination information (6.2) is

$$I(p^*:\pi) = \tau_1\theta_1 + \tau_2\theta_2 + \tau_3\theta_3 - \ln M(\tau_1,\tau_2,\tau_3) \tag{6.8}$$

where the $\theta$'s are defined in (6.3) and the $\tau$'s are determined to satisfy (6.6). Using the value in (6.7) it may be shown that if $p(\omega)$ is any member of the family of distributions satisfying (6.3), then

$$I(p:\pi) = I(p^*:\pi) + I(p:p^*). \tag{6.9}$$

The pythagorean type property (6.9) plays an important role in the analysis of information tables.
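Property (6.9) can be verified numerically in the simplest case. The sketch assumes a 2x2 table, a uniform reference distribution, and restraints equal to the one-way marginals of p, for which the minimizing distribution is the product of those marginals. The function name and the probabilities are hypothetical illustrations.

```python
import math

def info(p, q):
    """Discrimination information I(p:q) for two lists of cell probabilities."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

# Cells in the order (11), (12), (21), (22); the values are arbitrary illustrations.
p = [0.1, 0.2, 0.3, 0.4]
pi = [0.25] * 4

# With pi uniform and the restraints taken to be the one-way marginals of p,
# the minimizing distribution is the product of those marginals.
row = [p[0] + p[1], p[2] + p[3]]
col = [p[0] + p[2], p[1] + p[3]]
p_star = [row[i] * col[j] for i in (0, 1) for j in (0, 1)]

lhs = info(p, pi)                             # I(p:pi)
rhs = info(p_star, pi) + info(p, p_star)      # I(p*:pi) + I(p:p*)
```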

An "experiment" has been designed and observations made resulting in a multi-dimensional contingency table with the desired classifications and categories. All the information the analyst hopes to obtain from the experiment is contained in the contingency table. In the process of analysis, the aim is to fit the observed table by a minimal or parsimonious number of parameters depending on some or all of the marginals, that is, to find out how much of this total information is contained in a summary consisting of sets of marginals. Indeed, the relationship between the concepts of independence, association and interaction in contingency tables and the role the marginals play is evident in the historical developments in the extensive literature on the analysis of contingency tables. Thus, the $\theta$'s in the preceding discussion will be the marginals of interest. See Ku et al (1971).

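7.1. The $T(\omega)$ Functions. The $T(\omega)$ functions for the RxSxTxU table turn out to be a basic set of simple functions and their various products. Thus, for example, the $T(\omega)$ function associated with the one-way marginal p(2...) is

$$T_2^R(ijkl) = \begin{cases} 1 & \text{for } i = 2, \text{ any } j, k, l \\ 0 & \text{otherwise} \end{cases} \tag{7.1}$$

since

$$\sum_{ijkl} T_2^R(ijkl)\,p(ijkl) = p(2...). \tag{7.2}$$

Similarly the $T(\omega)$ function associated with the one-way marginal p(..3.), for example, is

$$T_3^T(ijkl) = \begin{cases} 1 & \text{for } k = 3, \text{ any } i, j, l \\ 0 & \text{otherwise} \end{cases} \tag{7.3}$$

since

$$\sum_{ijkl} T_3^T(ijkl)\,p(ijkl) = p(..3.). \tag{7.4}$$

Thus for the rxsxtxu table there are

$$\begin{aligned}
&(r-1) \text{ linearly independent functions } T_\alpha^R(ijkl), \quad \alpha = 1,\ldots,r-1, \\
&(s-1) \text{ linearly independent functions } T_\beta^S(ijkl), \quad \beta = 1,\ldots,s-1, \\
&(t-1) \text{ linearly independent functions } T_\gamma^T(ijkl), \quad \gamma = 1,\ldots,t-1, \\
&(u-1) \text{ linearly independent functions } T_\delta^U(ijkl), \quad \delta = 1,\ldots,u-1,
\end{aligned} \tag{7.5}$$

since, for example,

$$\sum_{\alpha=1}^{r} T_\alpha^R(ijkl) = 1.$$
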

We have arbitrarily excluded the functions corresponding to $\alpha = r$, $\beta = s$, $\gamma = t$, $\delta = u$, a natural convenient choice; we could have selected $\alpha = 1$, $\beta = 1$, $\gamma = 1$, $\delta = 1$ or any other set of values.

The $T(\omega)$ function associated with the two-way marginal p(12..), say, is $T_1^R(ijkl)\,T_2^S(ijkl)$ since from the definition of $T_1^R(ijkl)$ and $T_2^S(ijkl)$ it may be seen that

$$T_1^R(ijkl)\,T_2^S(ijkl) = \begin{cases} 1 & \text{for } i = 1,\ j = 2, \text{ any } k, l \\ 0 & \text{otherwise} \end{cases} \tag{7.6}$$

and

$$\sum p(ijkl)\,T_1^R(ijkl)\,T_2^S(ijkl) = p(12..). \tag{7.7}$$

For convenience we shall write $T_1^R(ijkl)\,T_2^S(ijkl) = T_{12}^{RS}(ijkl)$, etc. Thus the $T(\omega)$ function associated with any two-way marginal is a product of two appropriate functions of the set (7.5).

Similarly the $T(\omega)$ function associated with any three-way marginal will be a product of three of the appropriate functions of the set (7.5), for example,

$$\sum p(ijkl)\,T_2^R(ijkl)\,T_1^T(ijkl)\,T_3^U(ijkl) = p(2.13). \tag{7.8}$$

For convenience we shall write $T_2^R(ijkl)\,T_1^T(ijkl)\,T_3^U(ijkl) = T_{213}^{RTU}(ijkl)$, etc.

Similarly the $T(\omega)$ function associated with any four-way marginal will be a product of four of the appropriate functions of the set (7.5), for example,

$$\sum p(ijkl)\,T_2^R(ijkl)\,T_1^S(ijkl)\,T_1^T(ijkl)\,T_2^U(ijkl) = p(2112). \tag{7.9}$$

For convenience we shall write $T_2^R(ijkl)\,T_1^S(ijkl)\,T_1^T(ijkl)\,T_2^U(ijkl) = T_{2112}^{RSTU}(ijkl)$.
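The indicator nature of the T functions and their products can be made concrete with a short sketch. The helpers `T` and `expectation` and the probabilities are hypothetical; the example reproduces the marginal p(2.13) of (7.8) as the expected value of a product of three indicators.

```python
from itertools import product

def T(position, level):
    """Indicator T(w) equal to 1 when index `position` of the cell equals `level`."""
    return lambda cell: 1 if cell[position] == level else 0

def expectation(p, *funcs):
    """Sum over cells of p(w) times the product of the given T functions."""
    total = 0.0
    for cell, pw in p.items():
        value = pw
        for f in funcs:
            value *= f(cell)
        total += value
    return total

# Hypothetical 2x2x3x3 table with probabilities proportional to 1, 2, 3, ...
cells = list(product((1, 2), (1, 2), (1, 2, 3), (1, 2, 3)))
norm = sum(range(1, len(cells) + 1))
p = {cell: (n + 1) / norm for n, cell in enumerate(cells)}

T2_R = T(0, 2)   # T_2^R: i = 2
T1_T = T(2, 1)   # T_1^T: k = 1
T3_U = T(3, 3)   # T_3^U: l = 3

p_2dot13 = expectation(p, T2_R, T1_T, T3_U)   # p(2.13) as in (7.8)
direct = sum(pw for (i, j, k, l), pw in p.items() if i == 2 and k == 1 and l == 3)
```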

We note that there are a total of

$$\begin{aligned}
N_1 &= (r-1) + (s-1) + (t-1) + (u-1), \\
N_2 &= (r-1)(s-1) + (r-1)(t-1) + (r-1)(u-1) + (s-1)(t-1) + (s-1)(u-1) + (t-1)(u-1), \\
N_3 &= (r-1)(s-1)(t-1) + (r-1)(s-1)(u-1) + (r-1)(t-1)(u-1) + (s-1)(t-1)(u-1), \\
N_4 &= (r-1)(s-1)(t-1)(u-1),
\end{aligned} \tag{7.10}$$

respectively, of the simple linearly independent functions and their products two, three, four at a time. It may be verified that

$$rstu - 1 = N = N_1 + N_2 + N_3 + N_4. \tag{7.11}$$

These values of the number of $T(\omega)$ functions (or associated tau parameters) appear as appropriate degrees of freedom in the analysis of information tables.
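The counts in (7.10) and the identity (7.11) are easy to check for any particular table size. The function name and the dimensions below are hypothetical illustrations.

```python
from itertools import combinations
from math import prod

def interaction_counts(dims):
    """Return [N1, N2, ...]: numbers of T functions formed from 1, 2, ... factors."""
    reduced = [d - 1 for d in dims]
    return [
        sum(prod(combo) for combo in combinations(reduced, size))
        for size in range(1, len(dims) + 1)
    ]

dims = (2, 3, 4, 5)                  # hypothetical r, s, t, u
counts = interaction_counts(dims)    # [N1, N2, N3, N4]
total = sum(counts)                  # should equal r*s*t*u - 1
```

The identity holds in general because the product of the four factors (1 + (r-1)), (1 + (s-1)), (1 + (t-1)), (1 + (u-1)) expands into 1 plus the four sums.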

7.2. The Estimated $p^*(\omega)$ Values. In the usual least squares regression analysis procedure, one first computes the regression coefficients and then gets the values of the estimates. In the methodology we use we reverse the procedure. Instead of trying to obtain the values of the $\tau$'s from (6.6) (which is possible) we shall first obtain the values of the estimates $p^*(\omega)$ by a straightforward convergent iterative procedure and then derive the values of the $\tau$'s from (6.7). We shall not discuss the details of the iteration here, as they are in the computer program and have been described elsewhere. The iteration may be described as successively cycling through adjustments of the marginals of interest starting with the $\pi(\omega)$ distribution until a desired accuracy of agreement between the set of observed marginals of interest and the computed marginals has been attained. See Ku et al (1971).
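The report does not give the details of the iteration. One common way to realize "successively cycling through adjustments of the marginals" is proportional scaling of rows and columns in turn, and the sketch below assumes that scheme for a two-way table fitted to its one-way marginals. The function name, the cycle count and the counts are hypothetical.

```python
def fit_one_way_marginals(x, start, cycles=20):
    """Cycle through row and column adjustments so `start` acquires the marginals of `x`."""
    fit = [row[:] for row in start]
    row_targets = [sum(row) for row in x]
    col_targets = [sum(col) for col in zip(*x)]
    for _ in range(cycles):
        # Scale each row so its total matches the observed row marginal.
        for i, target in enumerate(row_targets):
            total = sum(fit[i])
            fit[i] = [v * target / total for v in fit[i]]
        # Scale each column so its total matches the observed column marginal.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in fit)
            for row in fit:
                row[j] = row[j] * target / total
    return fit

x = [[10, 20], [30, 40]]                          # arbitrary illustrative counts
n = sum(map(sum, x))
uniform_start = [[n / 4] * 2 for _ in range(2)]   # n * pi with pi uniform
fit = fit_one_way_marginals(x, uniform_start)
```

Starting from the uniform table, this case settles after a single cycle on x(i.)x(.j)/n; fitting higher-order marginals generally needs several cycles.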

7.3. The $\tau$ Values or Interaction Parameters. From the definitions of the $T(\omega)$ functions in Section 7.1 it is clear that they take on only the values 0 or 1 for each value of $\omega$. From the nature of the $T(\omega)$ functions the set of regression or log-linear relations (6.7) will have one with a single unknown value which can be determined, then there will be others with one additional unknown value, and so on, in terms of the $\tau$'s already determined. Thus these new unknown $\tau$ values can then be determined. This process of successive evaluation is carried on until all the values of $\tau$ are determined. They are also available as output of a general computer program.

8. Graphic Representation

A useful graphic representation of the log-linear regression (6.7) is given in Figure 8.1 for a 2x2x2x2 contingency table. This is the analogue of the design matrix in normal regression theory. The blank spaces in Figure 8.1 represent zero values. The (ijkl)-columns are the cell identifications in the same lexicographic order as the cell entries for the estimates in the computer output. Column 1 corresponds to L, which is essentially a normalizing factor. Each of the columns 2 to 16 represents the corresponding values of the $T(\omega)$ functions, columns 2 to 5 those for the one-way marginals, columns 6 to 11 those for the two-way marginals, columns 12 to 15 those for the three-way marginals, and column 16 that for the four-way marginal. For convenience the columns are also arranged in lexicographic order. The tau parameter associated with the $T(\omega)$ function is given at the head of the column. The full representation with all the columns of Figure 8.1 generates the observed values. Thus the rows represent

$$\ln \frac{x(ijkl)}{n\pi(ijkl)} = L + \tau_1^{i} T_1^{i}(ijkl) + \cdots + \tau_{11}^{ij} T_{11}^{ij}(ijkl) + \cdots + \tau_{111}^{ijk} T_{111}^{ijk}(ijkl) + \cdots + \tau_{1111}^{ijkl} T_{1111}^{ijkl}(ijkl) \tag{8.1}$$

where $\pi(ijkl)$ in the 2x2x2x2 case is 1/(2x2x2x2) and the numerical values of L and the taus depend on the observed values x(ijkl). The design matrix corresponding to an estimate uses only those columns associated with the marginals explicit and implied in the fitting process. This is a reflection of the fact that higher order marginals imply certain lower order marginals, for example, the two-way marginal x(ij..) implies, by summation over i and j, the one-way marginals x(.j..), x(i...), and the total n = x(....). Thus the estimate based on fitting the one-way marginals will use only columns 1-5. The values of L and the taus for this estimate will be different from those for x(ijkl) and depend on the estimate $x_1^*(ijkl)$. Thus if we denote the estimate based on fitting the one-way marginals as $x_1^*(ijkl)$, the representation in Figure 8.1 implies

$$\begin{aligned}
\ln \frac{x_1^*(1111)}{n\pi} &= L + \tau_1^{i} + \tau_1^{j} + \tau_1^{k} + \tau_1^{l}, \\
\ln \frac{x_1^*(1112)}{n\pi} &= L + \tau_1^{i} + \tau_1^{j} + \tau_1^{k}, \\
&\ \,\vdots \\
\ln \frac{x_1^*(2222)}{n\pi} &= L.
\end{aligned} \tag{8.2}$$

ω     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
ijkl  L     τ^i   τ^j   τ^k   τ^l   τ^ij  τ^ik  τ^il  τ^jk  τ^jl  τ^kl  τ^ijk τ^ijl τ^ikl τ^jkl τ^ijkl
1111  1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1
1112  1     1     1     1           1     1           1                 1
1121  1     1     1           1     1           1           1                 1
1122  1     1     1                 1
1211  1     1           1     1           1     1                 1                 1
1212  1     1           1                 1
1221  1     1                 1                 1
1222  1     1
2111  1           1     1     1                       1     1     1                       1
2112  1           1     1                             1
2121  1           1           1                             1
2122  1           1
2211  1                 1     1                                   1
2212  1                 1
2221  1                       1
2222  1

Figure 8.1. Graphic representation.
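The pattern of ones in Figure 8.1 follows a simple rule: the entry in the column for a set of factors is 1 exactly when every index in that set equals 1. The sketch below generates the rows on that assumption; the function name is hypothetical and the column order matches the lexicographic ordering described in the text.

```python
from itertools import combinations, product

def design_matrix(n_factors=4):
    """Rows of the 0/1 design matrix for a 2x2x...x2 table, cells in lexicographic order.

    Column 1 is the constant for L; each later column corresponds to a subset of
    factors and holds 1 exactly when every index in that subset equals 1.
    """
    subsets = [
        combo
        for size in range(1, n_factors + 1)
        for combo in combinations(range(n_factors), size)
    ]
    rows = {}
    for cell in product((1, 2), repeat=n_factors):
        rows[cell] = [1] + [int(all(cell[pos] == 1 for pos in s)) for s in subsets]
    return rows

matrix = design_matrix()
```

For example, `matrix[(1, 1, 2, 2)]` has ones only in the columns for L, i, j and ij, in line with the row labelled 1122.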

[...]

9. Analysis of Information

[...] if $x_a^*(\omega)$ is the m.d.i. estimate corresponding to a set $H_a$ of marginal restraints and $x_b^*(\omega)$ is the m.d.i. estimate corresponding to a set $H_b$ of marginal restraints which includes those of $H_a$ [...], then

$$\begin{aligned}
2I(x:n\pi) &= 2I(x_a^*:n\pi) + 2I(x:x_a^*), \\
2I(x:n\pi) &= 2I(x_b^*:n\pi) + 2I(x:x_b^*), \\
2I(x:x_a^*) &= 2I(x_b^*:x_a^*) + 2I(x:x_b^*),
\end{aligned} \tag{9.1}$$

with a corresponding additive relation for the associated degrees of freedom.

In terms of the representation in (6.4) or (6.7) or Figure 8.1 as an exponential family, for our discussion, the two extreme cases are the uniform distribution, for which all $\tau$'s are zero, and the observed contingency table or distribution, for which all N = rstu - 1 $\tau$'s are needed.

Measures of the form $2I(x:x_a^*)$, that is, the comparison of an observed contingency table with an estimated contingency table, are called measures of interaction or goodness-of-fit. Measures of the form $2I(x_b^*:x_a^*)$, comparing two estimated contingency tables, are called measures of effect, that is, the effect of the marginals in the set $H_b$ but not in the set $H_a$. We note that $2I(x:x_a^*)$ tests a null hypothesis that the values of the $\tau$ parameters in the representation of the observed contingency table $x(\omega)$ but not in the representation of the estimated table $x_a^*(\omega)$ are zero, and the number of these taus is the number of degrees of freedom. Similarly $2I(x_b^*:x_a^*)$ tests a null hypothesis that the values of the set of $\tau$ parameters in the representation of the estimated table $x_b^*(\omega)$ but not in the representation of the estimated table $x_a^*(\omega)$ are zero, and the number of these taus is the number of degrees of freedom.

We summarize the additive relationships of the m.d.i. statistics and the associated degrees of freedom in the Analysis of Information Table 9.1.

TABLE 9.1

ANALYSIS OF INFORMATION TABLE

Component due to      Information        D.F.
H_a : Interaction     2I(x:x*_a)         N_a
H_b : Effect          2I(x*_b:x*_a)      N_a - N_b
      Interaction     2I(x:x*_b)         N_b

Since measures of the form $2I(x:x_a^*)$ may also be interpreted as measures of the "variation unexplained" by the estimate $x_a^*$, the additive relationship leads to the interpretation of the ratio

$$\frac{2I(x:x_a^*) - 2I(x:x_b^*)}{2I(x:x_a^*)} = \frac{2I(x_b^*:x_a^*)}{2I(x:x_a^*)} \tag{9.2}$$

as the percentage of the unexplained variation due to $x_a^*$ accounted for by the additional constraints defining $x_b^*$. The ratio (9.2) is thus similar to the squared correlation coefficients associated with normal distributions.
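The additive relation of Table 9.1 and the ratio (9.2) can be illustrated on a 2x2 table. The sketch assumes the m.d.i. statistic has the form 2 Σ x(ω) ln(x(ω)/x*(ω)), takes H_a to fit only the total n (so the estimate is uniform) and H_b to fit the one-way marginals. The function name and the counts are hypothetical.

```python
import math

def two_I(x, y):
    """2I(x:y) = 2 * sum of x ln(x/y) over cells, for tables with equal totals."""
    return 2 * sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

# Cells in the order (11), (12), (21), (22); counts are arbitrary illustrations.
x = [10, 20, 30, 40]
n = sum(x)

x_a = [n / 4] * 4                                # H_a: fit the total n only
rows = [x[0] + x[1], x[2] + x[3]]
cols = [x[0] + x[2], x[1] + x[3]]
x_b = [r * c / n for r in rows for c in cols]    # H_b: fit the one-way marginals

interaction_a = two_I(x, x_a)
effect_b = two_I(x_b, x_a)
interaction_b = two_I(x, x_b)
ratio = effect_b / interaction_a                 # the ratio in (9.2)
```

Here the interaction under H_a splits exactly into the effect of the one-way marginals plus the interaction remaining under H_b.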

We remark that the marginals, explicit and implicit, of the estimated table $x_a^*(\omega)$ which form the set of restraints $H_a$ used to generate $x_a^*(\omega)$ are the same as the corresponding marginals of the observed $x(\omega)$ table and all lower order implied marginals. It may be shown that $2I(x:x_a^*)$ is approximately a quadratic in the differences between the remaining marginals of the $x(\omega)$ table and the corresponding ones as calculated from the $x_a^*(\omega)$ table.

Similarly $2I(x_b^*:x_a^*)$ is also approximately a quadratic in the differences between those additional marginal restraints in $H_b$ but not in $H_a$ and the corresponding marginal values as computed from the $x_a^*(\omega)$ table.

As may be seen, because of the nature of the $T(\omega)$ functions described in Section 7.1 or indicated in Figure 8.1, the $\tau$'s are determined from the log-linear regression Equations (6.7) (see (8.2) and (10.3)) [...]. A variety of statistics have been presented in the literature for the analysis of contingency tables which are quadratic in differences of marginal values or quadratic in differences of the $\tau$'s or linear combinations of logarithms of the observed or estimated values. The principle of minimum discrimination information estimation and its procedures thus provides a unifying relationship since such statistics may be seen as quadratic approximations of the minimum discrimination information statistic. We remark that the corresponding approximate $\chi^2$'s are not generally additive.

We mention the approximations in terms of quadratic forms in the marginals or the $\tau$'s as a possible bridge connecting the familiar procedures of classical regression analysis and the procedures proposed here to assist in understanding and interpreting the analysis of information tables. The covariance matrix of the $T(\omega)$ functions or the taus can be obtained for either the observed table or any of the estimated tables, as well as the inverse matrices, as part of the output of the general computer program. See (10.4) to (10.9).

10. The 2x2 Table

It may be useful to reexamine the 2x2 table from the point of view of the preceding discussion. The algebraic details are simple in this case and exhibit the unification of the information theoretic development.

Suppose we have the observed 2x2 table in Figure 10.1

x(11)    x(12)    x(1.)
x(21)    x(22)    x(2.)
x(.1)    x(.2)    n

Figure 10.1


If we obtain the m.d.i. estimate fitting the one-way marginals, the generalized independence hypothesis is the classical independence hypothesis and the minimum discrimination information estimate is x*(ij) = x(i.)x(.j)/n. The representation of the log-linear regression (6.7) as in Figure 8.1 for the full model is given in Figure 10.2.

i  j    L    τ1   τ2   τ3
1  1    1    1    1    1
1  2    1    1
2  1    1         1
2  2    1

Figure 10.2

The entries in the columns $\tau_1$, $\tau_2$, $\tau_3$ are, respectively, the values of the functions $T_1(ij)$, $T_2(ij)$, $T_3(ij)$ associated with the marginals $\theta_1 = x(1.)$, $\theta_2 = x(.1)$, $\theta_3 = x(11)$, and the column headed L corresponds to the normalizing factor (the negative of the logarithm of the moment-generating function as in (6.7)). We recall the interpretation of Figure 10.2 as the log-linear relations

$$\begin{aligned}
\ln \frac{x(11)}{n\pi} &= L + \tau_1 + \tau_2 + \tau_3, \\
\ln \frac{x(12)}{n\pi} &= L + \tau_1, \\
\ln \frac{x(21)}{n\pi} &= L + \tau_2, \\
\ln \frac{x(22)}{n\pi} &= L.
\end{aligned} \tag{10.1}$$

From (10.1) we find

$$\begin{aligned}
L &= \ln \frac{x(22)}{n/4}, \\
\tau_1 &= \ln \frac{x(12)}{x(22)}, \\
\tau_2 &= \ln \frac{x(21)}{x(22)}, \\
\tau_3 &= \ln \frac{x(11)\,x(22)}{x(12)\,x(21)},
\end{aligned} \tag{10.2}$$

or

$$\begin{aligned}
\tau_1 &= \ln x(12) - \ln x(22), \\
\tau_2 &= \ln x(21) - \ln x(22), \\
\tau_3 &= \ln x(11) + \ln x(22) - \ln x(12) - \ln x(21).
\end{aligned} \tag{10.3}$$
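The successive evaluation of L and the taus for the 2x2 table, and the way the log-linear relations regenerate the observed cells, can be sketched as follows. The function name and the counts are hypothetical; n·π is taken as n/4, as in the expression for L.

```python
import math

def log_linear_parameters(x):
    """Return (L, tau1, tau2, tau3) of (10.2) for a 2x2 table [[x11, x12], [x21, x22]]."""
    (x11, x12), (x21, x22) = x
    n = x11 + x12 + x21 + x22
    L = math.log(x22 / (n / 4))
    tau1 = math.log(x12 / x22)
    tau2 = math.log(x21 / x22)
    tau3 = math.log(x11 * x22 / (x12 * x21))
    return L, tau1, tau2, tau3

x = [[10, 20], [30, 40]]   # arbitrary illustrative counts
L, tau1, tau2, tau3 = log_linear_parameters(x)

# Rebuild the cell counts from the relations (10.1), with n*pi = n/4.
n_pi = sum(map(sum, x)) / 4
rebuilt = [
    [n_pi * math.exp(L + tau1 + tau2 + tau3), n_pi * math.exp(L + tau1)],
    [n_pi * math.exp(L + tau2), n_pi * math.exp(L)],
]
```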

If wC call T thte 1.,aLri: with colulins the Column;: of LiiC desigLt matrix

of Figure 10.2 t!,L' is,

(10.4 ) 1 1 0 0

0 01 0 0 0,

and define a diagonal matrix D with main diagonal the elements x(ij)

that is,x(I1) 0 0 0

(10(5) D 0 x(12) 0 00 0 x(21) 0

0 0 0 x(22)

then the estimate 4,t the covariance matrix of 01 x(l.) , 2 = x(.1)

x(ll) for the observed contingency table is Z A2.1 where

-. 1 2 .

2 . 1 11-12

atnd A11 is I x I ,-&:" is 3 x 3 , -1=A..is I x 3 .it is

found tha't

/" x~l)x(2.)x(] ~.I ) x(. ) :(Il) x(2.)nn n

A\

x(l)x(2.) x( 2)

(L fli 1111n2

21

Page 23: The Information in Contingency Tables - DTIC

/ -. ) -I ,_I_ 1

x(2 x(22)2-_ + _+-

-'- xQ.) : (2l) x(22.(21) x(22)1 1 1 1 1 1 1 1- ] i

We remark that the matrix in (10.9) is the covariance matrix of the taus in (10.3). Similar results hold in general and for estimated tables. Note that the value of the logarithm of the cross-product ratio, a measure of association or interaction, appears in the course of the analysis as the value of $\tau_3$ for the observed values x(ij), and that $\tau_3 = 0$ for x*(ij), the estimate under the hypothesis of independence, for which the representation as in Figure 10.2 does not involve the last column since it is obtained by fitting the one-way marginals.
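The relation between the covariance matrix of the marginals and that of the taus can be checked numerically. The sketch below, which uses hypothetical counts and hypothetical function names, forms A = T'DT for a 2x2 table, extracts A22.1 = A22 - A21 A11^{-1} A12, and multiplies it by the matrix of reciprocal-count sums that the usual large-sample argument gives for the taus in (10.3); the product should be the identity.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def covariance_of_marginals(x11, x12, x21, x22):
    """A22.1 for theta1 = x(1.), theta2 = x(.1), theta3 = x(11)."""
    T = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
    d = [x11, x12, x21, x22]                      # diagonal of D
    A = [[sum(T[r][a] * d[r] * T[r][b] for r in range(4))
          for b in range(4)] for a in range(4)]   # A = T' D T
    # Partition off the first row and column (the column headed L).
    return [[A[a][b] - A[a][0] * A[0][b] / A[0][0]
             for b in range(1, 4)] for a in range(1, 4)]

def covariance_of_taus(x11, x12, x21, x22):
    """Covariance matrix of tau1, tau2, tau3 built from reciprocal counts."""
    a, b, c, d = 1 / x11, 1 / x12, 1 / x21, 1 / x22
    return [[b + d, d, -(b + d)],
            [d, c + d, -(c + d)],
            [-(b + d), -(c + d), a + b + c + d]]

# Hypothetical counts, used only for illustration.
table = (30, 10, 20, 40)
product_matrix = matmul(covariance_of_marginals(*table), covariance_of_taus(*table))
```

In particular the (3,3) element of the second matrix is the familiar variance estimate of the log cross-product ratio, the sum of the reciprocals of the four cell counts.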

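The log-linear relations for the estimate x*(ij) are

(10.10)
$$\begin{aligned}
\ln\frac{x^*(11)}{n\pi} &= L + \tau_1 + \tau_2,\\
\ln\frac{x^*(12)}{n\pi} &= L + \tau_1,\\
\ln\frac{x^*(21)}{n\pi} &= L + \tau_2,\\
\ln\frac{x^*(22)}{n\pi} &= L,
\end{aligned}$$

where the numerical values of L, $\tau_1$, $\tau_2$ in (10.10) depend on x* and differ from the values in (10.1).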

The minimum discrimination information statistic to test the null hypothesis or model of independence is 2I(x:x*) with one degree of freedom. In this case the quadratic approximation is

(10.11)
$$2I(x:x^*) \approx \left(x(11) - \frac{x(1.)x(.1)}{n}\right)^2 \left(\frac{1}{x^*(11)} + \frac{1}{x^*(12)} + \frac{1}{x^*(21)} + \frac{1}{x^*(22)}\right).$$

Remembering that x*(ij) = x(i.)x(.j)/n, the right-hand side of (10.11) may also be shown to be

(10.12)
$$\chi^2 = \sum_i \sum_j \frac{\bigl(x(ij) - x(i.)x(.j)/n\bigr)^2}{x(i.)x(.j)/n},$$

the classical χ²-test for independence with one degree of freedom. Another test which has been proposed for the null hypothesis of no association or no interaction in the 2x2 table is

(10.13)
$$\bigl(\ln x(11) + \ln x(22) - \ln x(12) - \ln x(21)\bigr)^2 \left(\frac{1}{x(11)} + \frac{1}{x(12)} + \frac{1}{x(21)} + \frac{1}{x(22)}\right)^{-1},$$

which may be shown to be a quadratic approximation for 2I(x:x*) in terms of $\tau_3$ with the covariance matrix estimated using the observed values and not the estimated values. We remark that if the observed values are used to estimate the covariance matrix then instead of the classical χ²-test in (10.12) there is derived the modified Neyman chi-square

(10.14)
$$\chi^2 = \sum_i \sum_j \frac{\bigl(x(ij) - x(i.)x(.j)/n\bigr)^2}{x(ij)}.$$
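The statistics of this section can be compared side by side on one table. The following is a minimal sketch with hypothetical counts and hypothetical names; it computes 2I(x:x*), the quadratic approximation (10.11), the classical chi-square (10.12), the log cross-product ratio statistic (10.13) and the modified Neyman chi-square (10.14).

```python
import math

def independence_statistics(x11, x12, x21, x22):
    """Return the 2x2 independence statistics discussed in (10.11)-(10.14)."""
    obs = [x11, x12, x21, x22]
    n = sum(obs)
    rows = [x11 + x12, x21 + x22]
    cols = [x11 + x21, x12 + x22]
    # Estimate under independence, x*(ij) = x(i.)x(.j)/n, in cell order 11, 12, 21, 22.
    est = [rows[i] * cols[j] / n for i in range(2) for j in range(2)]
    mdi = 2 * sum(o * math.log(o / e) for o, e in zip(obs, est))
    deviation = x11 - est[0]
    quadratic = deviation ** 2 * sum(1 / e for e in est)          # (10.11)
    pearson = sum((o - e) ** 2 / e for o, e in zip(obs, est))     # (10.12)
    tau3 = math.log(x11 * x22 / (x12 * x21))
    log_ratio = tau3 ** 2 / sum(1 / o for o in obs)               # (10.13)
    neyman = sum((o - e) ** 2 / o for o, e in zip(obs, est))      # (10.14)
    return {"mdi": mdi, "quadratic": quadratic, "pearson": pearson,
            "log_ratio": log_ratio, "neyman": neyman}

# Hypothetical counts, used only for illustration.
stats = independence_statistics(30, 10, 20, 40)
```

For a 2x2 table the deviations x(ij) - x*(ij) all have the same absolute value, which is why (10.11) and (10.12) coincide; the other statistics differ from them only in how the covariance matrix is estimated.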

11. An Analysis

In order to coordinate and relate the various definitions, concepts, parameters, computational procedures, etc. discussed in the preceding sections we shall consider a detailed analysis of a specific contingency table.

Table 11.1 is a four-way contingency table of 14,053 men in a training program, cross-classified on the variables home region, level of education, race, and program completion. We denote the occurrences in the four-way cross-classification or contingency Table 11.1 by x(ijkℓ) with the notation

Variable             Index   1            2           3            4
Home Region          i       East         North       West         South
Level of Education   j       Below H.S.   H.S.        Above H.S.
Race                 k       White        Non-white
Program              ℓ       Failed       Passed

[Table 11.1 (observed four-way contingency table): not recoverable from the source scan.]

[The opening lines of this paragraph are illegible in the source.] [...] the preceding [...] determine the effect of the additional marginals and associated interaction parameters. The analysis of information [...] the significance or non-significance of the effects or interaction tau parameters.

[...] marginals. Since we are interested in the dependence of success in training on region, level of education, and race, we first fit the marginals x(ijk.), x(...ℓ). The estimate obtained, x*(ijkℓ) = x(ijk.)x(...ℓ)/n, is that under the null hypothesis or model of independence of success and the joint variable (region, level of education, race), or no interaction between success and the other variables. In other words we first want to determine whether the 24 columns of Table 11.1 are homogeneous or not with respect to the probability of passing or failing.

The corresponding m.d.i. statistic 2I(x:x*) has 23 degrees of freedom [...] independence or [...] effects.

Figure 11.1 is the complete design matrix of the log-linear representation. The representation for the estimate of joint independence uses columns 1-17, 21-22, 26-31, corresponding to the marginals explicit and implicit in the fitted marginal constraints. We can also interpret 2I(x:x*) as testing a null hypothesis that the 23 tau parameters in the representation of the observed values which are not in the representation of x* are zero [list of the corresponding columns illegible in the source].

[Figure 11.1 (design matrix of the log-linear representation for the four-way table): not recoverable from the source scan.]

The value of 2I(x:x*) is so large that we reject the model of joint independence. We therefore consider fitting sequences of nested marginals, all including x(ijk.), and various combinations of two- and three-way marginals containing success with other variables. We summarize some results in Analysis of Information Table 11.2. We have not included all the intermediate fitting sequences for conciseness. We remark that although the measure of the effect of additional marginals and associated parameters may vary according to the sequence in which they have been added, significant effects tend to remain significant and non-significant effects tend to stay non-significant, so that the first overall survey should determine the estimates and interaction parameters which warrant further investigation. For example, the effect of adding x(..kℓ) to x(ijk.), x(i..ℓ), x(.j.ℓ) is given in Analysis of Information Table 11.3 as 2I(x*_f:x*_a) = 1.410 with one degree of freedom, but the effect of adding x(..kℓ) to x(ijk.), x(ij.ℓ) is given in Analysis of Information Table 11.2 as 2I(x*_e:x*_m) = 1.239 with one degree of freedom. In neither case is the effect or the corresponding tau parameter $\tau^{k\ell}_{11}$ significant.

The columns of Figure 11.1 which occur in the log-linear representations of the estimates retained in Analysis of Information Table 11.2 are

Marginals Fitted                  Estimate   Columns of Figure 11.1
x(ijk.), x(...ℓ)                  x*         1-17, 21-22, 26-31
x(ijk.), x(i..ℓ), x(.j.ℓ)         x*_a       1-24, 26-31
x(ijk.), x(ij.ℓ)                  x*_m       1-24, 26-37
x(ijk.), x(ij.ℓ), x(..kℓ)         x*_e       1-37

From the analytic form of the log-linear representation or by taking differences of appropriate rows of Figure 11.1 within the columns used for the estimate, the log-odds of fail to pass for each of the estimates are given by the respective parametric representations in (11.1), where the superscripts relate to the variables and the subscripts range over the possible indices. The values of the parameters depend of course on the corresponding estimate.

TABLE 11.2

ANALYSIS OF INFORMATION TABLE

Component Due to                      Information                 D.F.
   x(ijk.), x(...ℓ)                   2I(x:x*)     = 160.551      23
a) x(ijk.), x(i..ℓ), x(.j.ℓ)          2I(x*_a:x*)  = 138.732       5
                                      2I(x:x*_a)   =  21.819      18
m) x(ijk.), x(ij.ℓ)                   2I(x*_m:x*_a) =   7.384      6
                                      2I(x:x*_m)   =  14.435      12
e) x(ijk.), x(ij.ℓ), x(..kℓ)          2I(x*_e:x*_m) =   1.239      1
                                      2I(x:x*_e)   =  13.196      11

(2I(x:x*) - 2I(x:x*_a)) / 2I(x:x*) = 138.732/160.551 = 0.86

(2I(x:x*) - 2I(x:x*_m)) / 2I(x:x*) = 146.116/160.551 = 0.91

(2I(x:x*) - 2I(x:x*_e)) / 2I(x:x*) = 147.355/160.551 = 0.92

TABLE 11.3

ANALYSIS OF INFORMATION TABLE

Component Due to                            Information                 D.F.
a) x(ijk.), x(i..ℓ), x(.j.ℓ)                2I(x:x*_a)   = 21.819       18
f) x(ijk.), x(i..ℓ), x(.j.ℓ), x(..kℓ)       2I(x*_f:x*_a) =  1.410       1
                                            2I(x:x*_f)   = 20.409       17
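The additivity that underlies an analysis of information table can be verified directly from the entries of Table 11.2. The short sketch below uses the tabulated values; the variable names are illustrative only.

```python
# Residual information 2I(x:x*_.) and degrees of freedom from Table 11.2.
total_information, total_df = 160.551, 23
residual = {"a": (21.819, 18), "m": (14.435, 12), "e": (13.196, 11)}

explained = {}            # 2I(x:x*) - 2I(x:x*_.), the part accounted for
explained_fraction = {}   # the ratios listed under Table 11.2
explained_df = {}
for label, (information, df) in residual.items():
    explained[label] = total_information - information
    explained_fraction[label] = explained[label] / total_information
    explained_df[label] = total_df - df

# Successive components: a -> m and m -> e.
component_m = residual["a"][0] - residual["m"][0]
component_e = residual["m"][0] - residual["e"][0]
```

Each successive component is the difference of two residual lines, and the degrees of freedom subtract in the same way.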


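(11.1)
$$\begin{aligned}
\ln\frac{x^*(ijk1)}{x^*(ijk2)} &= \tau^{\ell}_{1},\\
\ln\frac{x^*_a(ijk1)}{x^*_a(ijk2)} &= \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1},\\
\ln\frac{x^*_m(ijk1)}{x^*_m(ijk2)} &= \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1} + \tau^{ij\ell}_{ij1},\\
\ln\frac{x^*_e(ijk1)}{x^*_e(ijk2)} &= \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1} + \tau^{k\ell}_{k1} + \tau^{ij\ell}_{ij1}.
\end{aligned}$$

We recall that parameters with indices i = 4 and/or j = 3 and/or k = 2 and/or ℓ = 2 are by convention set equal to zero.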

We remark that x*_m(ijkℓ), determined by fitting the marginals x(ijk.), x(ij.ℓ), is expressible explicitly as

(11.2) x*_m(ijkℓ) = x(ijk.)x(ij.ℓ)/x(ij..)

and is the estimate under a null hypothesis that race and success are conditionally independent given home region and level of education.
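The explicit form (11.2) is easy to compute. The following sketch, with a hypothetical 2x2x2x2 table and hypothetical names, builds the estimate from the two fitted marginals so that one can confirm that both marginals are reproduced.

```python
from itertools import product

def conditional_independence_estimate(x, shape):
    """x*_m(ijkl) = x(ijk.) x(ij.l) / x(ij..) for a four-way table.

    x maps zero-based (i, j, k, l) tuples to counts; shape is (I, J, K, L).
    """
    I, J, K, L = shape
    estimate = {}
    for i, j in product(range(I), range(J)):
        x_ij = sum(x[(i, j, k, l)] for k in range(K) for l in range(L))
        for k, l in product(range(K), range(L)):
            x_ijk = sum(x[(i, j, k, m)] for m in range(L))   # marginal x(ijk.)
            x_ijl = sum(x[(i, j, m, l)] for m in range(K))   # marginal x(ij.l)
            estimate[(i, j, k, l)] = x_ijk * x_ijl / x_ij if x_ij else 0.0
    return estimate

# Hypothetical counts 1, 2, ..., 16, used only for illustration.
shape = (2, 2, 2, 2)
observed = {cell: float(index + 1)
            for index, cell in enumerate(product(range(2), repeat=4))}
fitted = conditional_independence_estimate(observed, shape)
```

Within each (i, j) slice the fitted 2x2 table of race by success has cross-product ratio one, which is the conditional independence stated above.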

In Analysis of Information Table 11.2 the value 2I(x:x*_m) = 14.435, 12 degrees of freedom, indicates an acceptable fit of this model. Furthermore, 2I(x*_e:x*_m) = 1.239, one degree of freedom, implies that the additional effect of the marginal x(..kℓ) is not significant or that in the parametric representation of the log-odds in (11.1) the parameter $\tau^{k\ell}_{11}$ measuring the effect of race on the dependent variable success is not significant. We therefore investigate the estimate x*_m in greater detail. The values of x*_m(ijkℓ) are given in Table 11.4.

In the expression for the log-odds under x*_m in (11.1), $\tau^{\ell}_{1}$ is an overall average, $\tau^{i\ell}_{i1}$ and $\tau^{j\ell}_{j1}$ are the effects of home region and level of education on program completion, and $\tau^{ij\ell}_{ij1}$ is the interaction effect of home region x level of education on program completion. The numerical values of the tau parameters are given in Table 11.5. We recall that by convention parameters with an index corresponding to i = 4 and/or j = 3 and/or ℓ = 2 are equal to zero.

[Table 11.4 (values of the estimate x*_m(ijkℓ)): not recoverable from the source scan.]

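TABLE 11.5

VALUES OF PARAMETERS IN LOG-ODDS FOR x*_m IN (11.1)

$\tau^{\ell}_{1}$ = [illegible; the source shows the digits 44347]     $\tau^{ij\ell}_{111}$ = -0.292478
$\tau^{i\ell}_{11}$ = 0.728653                                          $\tau^{ij\ell}_{121}$ = -0.689433
$\tau^{i\ell}_{21}$ = 0.041549                                          $\tau^{ij\ell}_{211}$ = -0.602435
$\tau^{i\ell}_{31}$ = -1.632427                                         $\tau^{ij\ell}_{221}$ = -1.003045
$\tau^{j\ell}_{11}$ = 1.312903                                          $\tau^{ij\ell}_{311}$ = 1.137932
$\tau^{j\ell}_{21}$ = 0.648130                                          $\tau^{ij\ell}_{321}$ = 0.360697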

From the parametric representation of the log-odds in (11.1) and the values in Table 11.5 one can determine differences in the log-odds associated with changes in various categories. Thus the differences in the log-odds (fail to pass) as one changes the home region, for fixed level of education, are given by

              E-N      E-W      E-S
Below H.S.    0.9970   0.7287   0.4362
H.S.          1.0007   1.3110   0.0392
Above H.S.    0.6871   2.3611   0.7287

The differences in the log-odds as one changes the level of education for fixed home region are given by

         Below H.S.-H.S.   H.S.-Above H.S.
East     1.0617            -0.0413
North    1.0654            -0.3549
West     1.4420             1.0088
South    0.6648             0.6481

For easier interpretation, we convert the log-odds values to ratios of the odds of failure.

              E/N    E/W    E/S
Below H.S.    2.7    2.1    1.6
H.S.          2.7    3.7    1.0
Above H.S.    2.0    10.6   2.1

         Below H.S./H.S.   H.S./Above H.S.
East     2.9               0.96
North    2.9               0.70
West     4.2               2.7
South    1.9               1.9
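The way such differences follow from the parameters of (11.1) can be sketched in a few lines. The dictionaries below hold the region, education and interaction parameters as read from Table 11.5 (one digit of the East, Below H.S. interaction is damaged in the source, so that value is an assumption); the overall constant cancels in any difference and is omitted. The function names are hypothetical.

```python
import math

# Parameters as read from Table 11.5; South and Above H.S. are zero by convention.
REGION = {"East": 0.728653, "North": 0.041549, "West": -1.632427, "South": 0.0}
EDUCATION = {"Below H.S.": 1.312903, "H.S.": 0.648130, "Above H.S.": 0.0}
INTERACTION = {
    ("East", "Below H.S."): -0.292478,   # damaged digit in the source; assumed
    ("East", "H.S."): -0.689433,
    ("North", "Below H.S."): -0.602435,
    ("North", "H.S."): -1.003045,
    ("West", "Below H.S."): 1.137932,
    ("West", "H.S."): 0.360697,
}

def log_odds_without_constant(region, education):
    """Log-odds of failure under x*_m, omitting the overall constant."""
    return (REGION[region] + EDUCATION[education]
            + INTERACTION.get((region, education), 0.0))

def log_odds_difference(cell_a, cell_b):
    """Difference in log-odds between two (region, education) cells."""
    return log_odds_without_constant(*cell_a) - log_odds_without_constant(*cell_b)

def odds_ratio(cell_a, cell_b):
    """Ratio of the odds of failure, the exponential of the difference."""
    return math.exp(log_odds_difference(cell_a, cell_b))

east_vs_west_above = odds_ratio(("East", "Above H.S."), ("West", "Above H.S."))
```

For example, the East to West comparison at Above H.S. involves only the two region parameters, since every interaction term with j = 3 is zero by convention.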

Note that the odds of failure in training of a man with home region East and Above H.S. level of education are 10.6 times the odds of a man with the same level of education but home region West.

Men with home region East or North but with level of education H.S. do better than men with the same home region but Above H.S. level of education.

We have also computed the odds of failure x*_m(ijk1)/x*_m(ijk2) and listed the results in increasing values. The odds are expressed to 1,000, that is, 5 to 1,000, 6 to 1,000, etc.

Home region   Level of Education   Odds
West          Above H.S.            2
West          H.S.                  6
North         H.S.                  9
South         Above H.S.           12
North         Above H.S.           12
South         H.S.                 22
East          H.S.                 23
East          Above H.S.           24
North         Below H.S.           25
West          Below H.S.           26
South         Below H.S.           43
East          Below H.S.           67

Note that the overall odds of failure are 311/13742 = 0.0226 or 23.

For ease of comparison and inference, we also list the foregoing results by home region and level of education.

              West   North   South   East
Above H.S.     2     12      12      24
H.S.           6      9      22      23
Below H.S.    26     25      43      67

12. Outliers

We define outliers as observations in one or more cells of a contingency table which apparently deviate significantly from a fitted model. These outliers may lead one to reject a model which fits the other observations. For example, in multi-dimensional contingency tables in which time or age is one of the classifications there may occur an age [...] such that a model may be rejected for the entire table but a model taking the [...] into account may lead to an acceptable [...].

In other cases even though a model seems to fit, the outliers contribute much more than reasonable to the measure of deviation between the [...] of the model. In other words the outliers make up a large percentage of the "unexplained variation" 2I(x:x*_a).

A clue to possible outliers is provided by the output of the computer program. In the complete output for each estimate five entries are [...].

[Several lines illegible in the source.] [...] the minimum discrimination information estimate x*_b subject to the same marginal restraints as x*_a except that the value x(ω1) is not included, so that x*_b(ω1) = x(ω1). The basic additivity property of the minimum discrimination information statistics states that

(12.1) 2I(x:x*_a) = 2I(x*_b:x*_a) + 2I(x:x*_b)

or

2I(x:x*_a) - 2I(x:x*_b) = 2I(x*_b:x*_a).

These are summarized in the Analysis of Information Table 12.1.

TABLE 12.1

ANALYSIS OF INFORMATION TABLE

Component due to        Information        D.F.
[Entries illegible in the source; by (12.1) the components are 2I(x:x*_a), 2I(x*_b:x*_a) and 2I(x:x*_b).]

(12.2) [equation illegible in the source]

we get

(12.3)
$$2I(x^*_b:x^*_a) \ge 2\left( x(\omega_1)\ln\frac{x(\omega_1)}{x^*_a(\omega_1)} + \bigl(n - x(\omega_1)\bigr)\ln\frac{n - x(\omega_1)}{n - x^*_a(\omega_1)} \right).$$

The last value can be computed and is listed as the OUTLIER entry for each cell of the computer output for the estimate x*_a. We remark that a separate outlier computation for each cell is time consuming. The ratio

(12.4)
$$\frac{2I(x:x^*_a) - 2I(x:x^*_b)}{2I(x:x^*_a)} = \frac{2I(x^*_b:x^*_a)}{2I(x:x^*_a)}$$

then indicates the percentage of the "unexplained variation" due to the outlier cell.
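The OUTLIER entry for a single cell is a two-term expression that is cheap to evaluate. The sketch below assumes the form 2( x ln(x/x*_a) + (n - x) ln((n - x)/(n - x*_a)) ) for a cell with observed value x and estimate x*_a in a table of total n; the function names and the numbers in the example are illustrative assumptions, not output of the authors' program.

```python
import math

def outlier_value(observed, estimate, n):
    """Two-term outlier measure for one cell against the rest of the table."""
    def term(a, b):
        # a * ln(a / b), with the usual convention 0 * ln 0 = 0
        return a * math.log(a / b) if a > 0 else 0.0
    return 2 * (term(observed, estimate) + term(n - observed, n - estimate))

def share_of_unexplained(component, total):
    """Ratio of a component 2I(x*_b:x*_a) to the total 2I(x:x*_a)."""
    return component / total

# Illustrative values only: a cell observed as 4 where the model gives 19.38,
# in a table assumed to have n = 13386 observations.
example_value = outlier_value(4, 19.38, 13386)
example_share = share_of_unexplained(26.79, 55.19)
```

A cell whose observed value equals its estimate gives zero, and the measure grows as the two move apart in either direction.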

We shall illustrate the outlier procedures and analysis using data originally given by H. F. Dorn (The relationship of cancer of the lung and the use of tobacco. American Statistician, 8 (1954), 7-13) and analyzed by J. Cornfield (A statistical problem arising from retrospective studies, Proc. 3rd Berkeley Symposium 4 (1956), 135-148).

[...] residuals [...] for random samples from a normal distribution. [...]

In Table 12.2 are listed the observations from 14 retrospective studies on the possible association between smoking and lung cancer. We denote the occurrences in the three-way 14x2x2 contingency table by x(ijk) with the notation

Variable   Index   1            2             3 ... 14
Study      i       No. 1        No. 2         ... No. 14
Patient    j       Control      Lung cancer
Smoking    k       Non-smoker   Smoker

Does this data indicate an association between smoking and lung cancer, and if so, is the association homogeneous over the 14 studies? Here the measure of association is the logarithm of the cross-product ratio.

Table 12.2

          Control Patients           Lung Cancer Patients
Study     Non-Smokers   Smokers      Non-Smokers   Smokers
 1          14            72            3            83
 2          43           227            3            90
 3          19            81            7           129
 4         125           397           12            70
 5         131           299           32           412
 6         114           666            8           597
 7          12           174            5            88
 8          61          1296            7          1350
 9          27           106            3            60
10          81           534           18           459
11          54           246            4           724
12          56           462           19           499
13         636          1729           39           451
14          28           259            5           260

The hypothesis that smoking and lung cancer are conditionally independent given the study

(12.5) H1: p(ijk) = p(ij.)p(i.k)/p(i..)

imposes the restraints on the estimate x*_a(ijk) that

(12.6) x*_a(ij.) = x(ij.) and x*_a(i.k) = x(i.k);

in fact x*_a(ijk) may be explicitly represented by

(12.7) x*_a(ijk) = x(ij.)x(i.k)/x(i..).

Similarly the hypothesis of no second-order interaction

(12.8) H2: p(ijk) = a(ij)b(ik)c(jk)

imposes the restraints on the estimate x*_2(ijk) that

(12.9) x*_2(ij.) = x(ij.), x*_2(i.k) = x(i.k), x*_2(.jk) = x(.jk).

The estimate x*_2(ijk) cannot be represented as an explicit function of the observed marginals.
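Because the estimate that fits all three two-way marginals has no explicit form, it has to be computed iteratively. The sketch below uses iterative proportional fitting started from a uniform table, which is one standard way to obtain such an estimate; it is offered as an illustration on a hypothetical 2x2x2 table, not as a description of the program used for the analyses reported here, and all names are hypothetical.

```python
import math
from itertools import product

def two_way_margin(table, drop):
    """Sum a three-way table over the axis numbered `drop` (0, 1 or 2)."""
    margin = {}
    for cell, value in table.items():
        key = cell[:drop] + cell[drop + 1:]
        margin[key] = margin.get(key, 0.0) + value
    return margin

def fit_all_two_way_marginals(x, shape, cycles=200):
    """Rescale a uniform start to x(ij.), x(i.k), x(.jk) in turn."""
    estimate = {cell: 1.0 for cell in product(*(range(s) for s in shape))}
    for _ in range(cycles):
        for drop in (2, 1, 0):      # fits x(ij.), then x(i.k), then x(.jk)
            target = two_way_margin(x, drop)
            current = two_way_margin(estimate, drop)
            for cell in estimate:
                key = cell[:drop] + cell[drop + 1:]
                if current[key] > 0:
                    estimate[cell] *= target[key] / current[key]
    return estimate

def log_cross_product_ratio(table, i):
    """Log cross-product ratio of the 2x2 slice of the table at level i."""
    return math.log(table[(i, 0, 0)] * table[(i, 1, 1)]
                    / (table[(i, 0, 1)] * table[(i, 1, 0)]))

# Hypothetical counts, used only for illustration.
observed = dict(zip(product(range(2), repeat=3), [10, 20, 30, 40, 50, 60, 70, 85]))
fitted = fit_all_two_way_marginals(observed, (2, 2, 2))
```

Each rescaling multiplies the table by a factor depending on only two of the three indices, so the fitted table keeps a common log cross-product ratio across the levels of i while the fitted marginals converge to the observed ones.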

In this study the minimum discrimination information statistics are [...].

(12.10) [equation illegible in the source]

[...] lung cancer and [...] are given by the log-linear [...]

(12.11) [equation illegible in the source]

where the values of the tau parameters depend of course on [...].

[Analysis of information table illegible in the source.]

The value of 2I(x:x*_a) when compared [...] the null hypothesis of [...] cancer conditioned on the study. Use of the conditional hypothesis allows the accumulation of information from different studies without imposing the requirement that the population characteristics of each study be similar. The rejection of the conditional independence hypothesis is of course expected. The degree of departure from independence is functionally dependent on the study. Is this dependence the result of a small subset of the studies which are substantially different from the remainder, or does the departure vary along a continuum?

The value of 2I(x*_2:x*_a) suggests that in accordance with (12.10) the value of $\tau^{jk}_{11}$ = 1.687 is significantly different from zero. Moreover, 2I(x:x*_2) is also significant when compared to the [...]th percentile of the χ² distribution. The value of 2I(x:x*_2) suggests that we reject the null hypothesis of no second-order interaction, that is, the model with a common value of the interaction parameter $\tau^{jk}_{11}$ is not a good fit. The values of x*_2 are given in Table 12.3. We now proceed to determine the outliers.
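The common interaction value quoted above can be recovered from any single study's row of the no-second-order-interaction estimate, since the log cross-product ratio is the same for every study under that model. The sketch below uses two rows as read from Table 12.3 (Studies 2 and 9); the function name is hypothetical, and small differences between rows reflect the two-decimal rounding of the tabulated estimates.

```python
import math

def log_cross_product_ratio(control_non, control_smoke, cancer_non, cancer_smoke):
    """ln[ x(i11) x(i22) / ( x(i12) x(i21) ) ] for one study."""
    return math.log(control_non * cancer_smoke / (control_smoke * cancer_non))

# Rows as read from Table 12.3: control non-smokers, control smokers,
# lung cancer non-smokers, lung cancer smokers.
study_2 = (42.86, 227.14, 3.14, 89.86)
study_9 = (27.15, 105.85, 2.85, 60.15)

tau_study_2 = log_cross_product_ratio(*study_2)
tau_study_9 = log_cross_product_ratio(*study_9)
```

Exponentiating gives a cross-product ratio of about 5.4, that is, under this model the odds of being a smoker are about 5.4 times higher among the lung cancer patients than among the controls.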

Table 12.3

          Control Patients           Lung Cancer Patients
Study     Non-Smokers   Smokers      Non-Smokers   Smokers
 1          14.01         71.99         2.99         83.01
 2          42.86        227.14         3.14         89.86
 3          19.99         80.01         6.01        129.99
 4         132.16        389.83         4.84         77.16
 5         130.03        300.00        32.97        410.99
 6         105.06        674.94        16.94        588.06
 7          15.47        170.53         1.54         91.47
 8          57.06       1299.93        10.94       1346.08
 9          27.15        105.85         2.85         60.15
10          85.21        529.79        13.79        463.21
11          38.62        261.39        19.38        708.62
12          62.23        455.77        12.77        505.23
13         643.32       1721.66        31.69        458.33
14          27.83        259.16         5.17        259.84

Examination of the computer output for x*_2 using all 14 studies showed a largest OUTLIER value of 18.14 for the cell (11,2,1). A new estimate fitting the marginals x(ij.), x(i.k), x(.jk) and omitting the cell (11,2,1) was obtained. In fact Study 11 was omitted, because with the constraints for the new estimate x*_b(11,j,k) = x(11,j,k). Since this estimate yielded

(12.12) 2I(x:x*_b) = 28.40, 12 d.f.

the deletion procedure was continued. We summarize the results in Table 12.4 and Analysis of Information Table 12.5.

Table 12.4

Fitting x(ij.), x(i.k), x(.jk) with successive deletion of studies

Studies                  Cell       OUTLIER Value   Information            D.F.
1-14                     (11,2,1)   18.14           2I(x:x*_2) = 55.19     13
1-10, 12-14              (6,2,1)     7.89           2I(x:x*_b) = 28.40     12
1-5, 7-10, 12-14         (4,2,1)     4.07           2I(x:x*_c) = 18.03     11
1-3, 5, 7-10, 12-14      (7,2,1)     3.91           2I(x:x*_d) = 11.94     10
1-3, 5, 8-10, 12-14                                 2I(x:x*_e) =  7.03      9

Table 12.5

Analysis of Information

Component due to      Information                D.F.
All 14 studies        2I(x:x*_2)    = 55.19      13
Less 11               2I(x*_b:x*_2) = 26.79       1
                      2I(x:x*_b)    = 28.40      12
Less 11, 6            2I(x*_c:x*_b) = 10.37       1
                      2I(x:x*_c)    = 18.03      11
Less 11, 6, 4         2I(x*_d:x*_c) =  6.08       1
                      2I(x:x*_d)    = 11.94      10
Less 11, 6, 4, 7      2I(x*_e:x*_d) =  4.92       1
                      2I(x:x*_e)    =  7.03       9

Since (2I(x:x*_2) - 2I(x:x*_e))/2I(x:x*_2) = 2I(x*_e:x*_2)/2I(x:x*_2) = 48.16/55.19 = 0.87 we see that the four studies numbered 4, 6, 7, 11 contributed 87% of the "unexplained variation" 2I(x:x*_2). The values of the estimate x*_e are given in Table 12.6. The value of the log cross-product ratio is

(12.13)
$$\ln\frac{x^*_e(i11)\,x^*_e(i22)}{x^*_e(i12)\,x^*_e(i21)} = 1.55, \qquad i = 1\text{-}3,\ 5,\ 8\text{-}10,\ 12\text{-}14.$$

Table 12.6

          Control Patients           Lung Cancer Patients
Study     Non-Smokers   Smokers      Non-Smokers   Smokers
 1          13.68         72.32         3.32         82.69
 2          42.46        227.53         3.54         89.47
 3          19.40         80.60         6.60        129.40
 5         126.85        303.18        36.15        407.83
 8          55.79       1301.21        12.21       1344.80
 9          26.80        106.20         3.20         59.80
10          83.61        531.39        15.39        461.62
12          60.81        457.20        14.19        503.81
13         639.35       1725.64        35.66        454.36
14          27.24        259.76         5.76        259.24

We note that Cox (1970) in analyzing the data of Table 12.2 concluded that studies 8, 6, and 11 were outliers. For the 14 studies he found a residual sum of squares 47.7 with 13 degrees of freedom. With studies 8, 6, and 11 omitted he found a residual sum of squares 15.1 with 10 degrees of freedom. (Cox (1970) p. 83 gives the degrees of freedom as 11, a misprint.)

Following the procedure described, when Studies 6, 8, and 11 were omitted the results led to the Analysis of Information Table 12.7. Note that omitting Studies 11, 6, 4 as per Table 12.5 accounts for more of the unexplained variation.

Table 12.7

Analysis of Information

Component due to      Information                D.F.
All 14 studies        2I(x:x*_2)    = 55.19      13
Less 6, 8, 11         2I(x*_f:x*_2) = 41.62       3
                      2I(x:x*_f)    = 13.57      10

The sequential procedure discussed herein was also applied to data relating father and son professions published by Karl Pearson (1904), "On the theory of contingency and its relation to association and normal correlation," reprinted in Karl Pearson's Early Papers, Cambridge University Press, 1948, and considered by Fienberg (1969) and Good (1956). Using the Pearson data Fienberg obtained an X² = 184.9 with 44 out of 196 cells deleted whereas the sequential procedure led to an X² = 155.3 with 25 cells deleted.

13. Zero Marginals

As may be noted from the analysis in Section 11, zero

occurrences in cells of the observed contingency table

present no special problem provided that no marginal entering

into the fitting specification is zero. When the latter is

the case, however, the interpretation may be distorted

because of inflated degrees of freedom. A procedure to

circumvent this problem is similar to that used for getting

revised estimates when outliers are indicated. We shall

present the procedure in terms of a specific example.

The following data resulted from a study of Christmas

tree consumption. We are indebted to Dipl. Forstwirt Dietrich

V. Staden, Institut f. Forstbenutzung, Universitaet Goettingen

for the data and permission to use it. In Table 13.1 are

listed responses to the question "Did you have a Christmas

tree in your apartment/house last year or not?" according to

size of household and size of city. We denote the occurrences

in the three-way 2x9x5 contingency table by x(ijk) with the notation

Variable         Index   1       2               3                 4                  5                6   7   8   9
Tree             i       Yes     No
Household size   j       1       2               3                 4                  5                6   7   8   9
City size        k       <2000   2000 to 20000   20000 to 100000   100000 to 500000   500000 or more

For a 2x9x5 RxCxD contingency table we compute an estimate under a hypothesis of no second-order interaction by fitting all the two-way marginals. Call this estimate x*_b(ijk). A test for the null hypothesis of no second-order interaction is given by

$$2I(x:x^*_b) = 2\sum_i\sum_j\sum_k x(ijk)\ln\frac{x(ijk)}{x^*_b(ijk)}, \qquad 32 \text{ D.F.}$$

If there is no second-order interaction then the associations between R and C, R and D, C and D are the same for all values of the third variable, that is,

$$\ln\frac{x^*_b(1jk)\,x^*_b(29k)}{x^*_b(2jk)\,x^*_b(19k)} \text{ depends only on } j,$$

$$\ln\frac{x^*_b(1jk)\,x^*_b(2j5)}{x^*_b(2jk)\,x^*_b(1j5)} \text{ depends only on } k,$$

$$\ln\frac{x^*_b(ijk)\,x^*_b(i95)}{x^*_b(i9k)\,x^*_b(ij5)} \text{ is independent of } i.$$

[...] the estimate x*_a(ijk) = x(ij.)x(.jk)/x(.j.) is the estimate under a null hypothesis of conditional independence of R and D given C. We now have the Analysis of Information Table 13.2.

Table 13.2

Component due to                                 Information       D.F.
Conditional independence of R and D given C      2I(x:x*_a)        36
Effect of x(i.k) given x(ij.) and x(.jk)         2I(x*_b:x*_a)      4
No second-order interaction                      2I(x:x*_b)        32

For the particular data in question however we have x(ijk) = 0 for j = 6, 7, 8, 9, i = 2 and also for some [...], j = 7, 8, 9; the estimates for the entries corresponding to x*(ijk) for j = 6, 7, 8, 9, both for x*_a and x*_b, will not differ from the observed values. Accordingly let us now compute an estimate x*_f(ijk) which is obtained by fitting the two-way marginals of the 2x5x5 table j = 1, 2, 3, 4, 5, and x*_f(ijk) = x(ijk), j = 6, 7, 8, 9. Similarly let x*_e(ijk) = x(ij.)x(.jk)/x(.j.) for the 2x5x5 table j = 1, 2, 3, 4, 5, and x*_e(ijk) = x(ijk), j = 6, 7, 8, 9.

We now find

Table 13.3

Component due to                                 Information                  D.F.
Conditional independence of R and D given C      2I(x:x*_e)    = 25.532       20
Effect of x(i.k) given x(ij.) and x(.jk)         2I(x*_f:x*_e) =  5.821        4
No second-order interaction                      2I(x:x*_f)    = 19.711       16

Note the reduction in the degrees of freedom between Table 13.2

and Table 13.3. It is also interesting to note that when

actually carrying out the procedures for Table 13.2, the same

estimates and statistics were obtained as for Table 13.3. See

Table 13.4 and 13.5, Table 13.6 and 13.7.
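The reduction in degrees of freedom between the two tables follows from the usual counting rules for a three-way RxCxD table: (R-1)(C-1)(D-1) for no second-order interaction and C(R-1)(D-1) for conditional independence of R and D given C. The sketch below applies these rules to the full 2x9x5 table and to the 2x5x5 table for household sizes 1 to 5; the function names are hypothetical.

```python
def df_no_second_order_interaction(r, c, d):
    """Degrees of freedom when all two-way marginals are fitted."""
    return (r - 1) * (c - 1) * (d - 1)

def df_conditional_independence(r, c, d):
    """Degrees of freedom for R and D independent given C."""
    return c * (r - 1) * (d - 1)

full_table = (2, 9, 5)       # tree x household size x city size
reduced_table = (2, 5, 5)    # household sizes 1 to 5 only

df_full = (df_conditional_independence(*full_table),
           df_no_second_order_interaction(*full_table))
df_reduced = (df_conditional_independence(*reduced_table),
              df_no_second_order_interaction(*reduced_table))
```

In both cases the difference between the two counts is (R-1)(D-1) = 4, the degrees of freedom of the effect of x(i.k).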

It seems reasonable to conclude that the purchase of a

Christmas tree is independent of the size of the city given

the size of the household (j = 1, 2, 3, 4, 5), and households of size 6, 7, 8, 9 seem almost sure to buy Christmas trees.

The log-odds for the purchase of a Christmas tree as a function of household size is given in Table 13.8. The probability estimate for a purchase as a function of household size is given in Table 13.9.

Table 13.8

ln (x*_e(1jk)/x*_e(2jk)) = ln (x(1j.)/x(2j.))

j = 1   -0.2586
    2    0.8662
    3    2.1702
    4    3.4012
    5    2.3716

Table 13.9

x*_e(1jk)/x*_e(.jk) = x(1j.)/x(.j.)

j = 1    61/140 = 0.4357
    2   214/304 = 0.7039
    3   219/244 = 0.8975
    4   180/186 = 0.9677
    5    75/82  = 0.9146
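Under conditional independence the log-odds and the probability of a purchase depend only on the household-size marginals, so both tables can be recomputed from five pairs of counts. The sketch below uses the counts as read from Table 13.9 (the numerators for j = 1 and j = 5 are partly illegible in the source and are inferred here from the log-odds in Table 13.8); the names are hypothetical.

```python
import math

# (households with a tree, all households) for household sizes j = 1, ..., 5.
COUNTS = {1: (61, 140), 2: (214, 304), 3: (219, 244), 4: (180, 186), 5: (75, 82)}

def purchase_log_odds(with_tree, total):
    """ln( x(1j.) / x(2j.) ), the log-odds of having a tree."""
    return math.log(with_tree / (total - with_tree))

def purchase_probability(with_tree, total):
    """x(1j.) / x(.j.), the estimated probability of having a tree."""
    return with_tree / total

log_odds = {j: purchase_log_odds(*pair) for j, pair in COUNTS.items()}
probability = {j: purchase_probability(*pair) for j, pair in COUNTS.items()}
```

The two tables carry the same information: each probability is the logistic transform exp(t)/(1 + exp(t)) of the corresponding log-odds t.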

For more complex situations there is also the log-linear analysis, which is of course available for this problem too, but it would not add anything to the analysis of this particular data.

[Tables 13.4 to 13.7 (observed values x(ijk) and the corresponding estimates for the Christmas tree data): not recoverable from the source scan.]

[...] Grant [...] under Contract Number N00014-67-A-[...] and by the [...] Research [...] and Air Force Office of Scientific Research [...] Contract No. N00014-67-A-0214-0015. Computations were carried out at the Computer Center, The George Washington University. The support and encouragement of Professor Henry Solomon and Professor Herbert Solomon are gratefully acknowledged.

This paper was the basis of lectures presented by Professor S. Kullback: 15 June 1973 at Osaka University, Osaka, Japan; 20 June 1973 at The Institute of Statistical Mathematics, Tokyo, Japan; and 6 July 1973 at Hiroshima University, Hiroshima, Japan. The opportunity and invitations [...] were made possible by the partial support [...] of the Japan Society for the Promotion of Science, Professor Minoru Sakaguchi, Professor Kiyonori Kunisawa, Dr. Yukiyosi Kawada, Dr. Kameo Matusita, Professor [name illegible] and Professor Sumiyasu Yamamoto. Their support and cooperation are gratefully acknowledged.

The bibliography is essentially a compilation of those prepared by Dr. Marvin A. Kastenbaum and Dr. H. H. Ku covering the period through 1971 and permission to use their results is gratefully acknowledged.

[Title page of the bibliography; text illegible in the source.]

REF ERENCES

1972

AA.V , J. and ENKE, ii. (1972). Analyse iw Icrdiiuen;i (,ana Lr kont i n!La v!In

:,Lt iitlfe d(jc nutr.jt on ina::;es von Kuilback. Biometri: ,t-

Zeitschrift 14, 5, pp. 305-323.

BE.KKS'N, J. (1972). >Linimun discrimination infor-'iation, the "n. inter-

action" probliem, and the logistic function. Biometrics 28, 2,

pp. 443--468.

BRUNDEN, M. N. (1972). The analvsis of non-independent 2x2 tables from

2xc tables using rank sums. Biometrics 28, 2, pp. 603-606.

CAUSEY, B. D. (1972). Sensitivity of raked corztingency table totals to

changes in problem conditions. Ann. Math. Statist. 43, 2,

pp. 656-658.

COX, D. R. (1972). hlie analysis of multivariate binary data. Appl

>tatlst. 21, 2, pp. 113-120.

DARr OCi, J. N. and PTCITFF, 1. (1972). Generalized iterative scaling

for log-linear models. Ann. Math. Statist. 43, 5, pp. 1470-1480.

FIE 1 K RG, S. E. (1-972). The analysis of incomplete multiway contingency

tables. Bitmetric ; 28, 1, pp. 177-202.

GA[l,, M. H. (1972). Ni:.ed quasi-independence models for categorical

data. 0iometrics 28, 3, pp. 703-712.

CART, J. J. (1912). Tntiraction tests for 2xs.-t contingency tables.

Biometrika 59 2, pp. 309-316.

GOKIL\LE, D. V. (1972). Analysis of log-linear models. J. Roy. Statist.

Soc. Ser. B 34, 3, pp. 371-376.

GOODMAN, L. A. and KRUSKAL, W. H. (1972). Measures of association for cross-classifications, IV: simplification of asymptotic variances. J. Amer. Statist. Assoc. 67, pp. 415-421.


1 1. 1' 0 A . . ( . 7.:) . 1_ - I r .L 1; all~i tests

W~jP t; IM, *. 1 i N~Ili" f, D. 1.. (197'). Linear modlel

7.>' K . ~ i L iLcr~U~ereopnseVectors.

NATHAN, G. (1972). On the asymptotic power of tests for independence in contingency tables from stratified samples. J. Amer. Statist. Assoc. 67, pp. 917-920.

VICTOR, N. (1972). Zur Klassifizierung mehrdimensionaler Kontingenztafeln.

1971

ALTHAM, P. M. E. (1971). Exact Bayesian analysis of the intraclass 2x2 table. Biometrika 58, pp. 679-680.

ALTHAM, P. M. E. (1971). The analysis of matched proportions. Biometrika

5, 3,p. 1-l

BL.11:, ;.V. 'Ind COP::iI. , R. G. (1971) . Srnhe gtests ,)f symmnetry

in cuant i ngeiic', uy Biomtrics 27, pp. 1074--1078.

BISHOP, Y. M. M. (1971). Effects of collapsing multidimensional contingency tables. Biometrics 27, pp. 545-562.

COHLN, .7. (19/ 1). 1Li :i ioii and into r:ic Lion in ;I censored 2x2x2

CO~t11 1'.iIC tt~e 7'2 ti 27, 3-/9-3 (,.

DEMPSTER, A. P. (1971). An overview of multivariate data analysis. Journal of Multivariate Analysis 1, pp. 316-347.


Ft.E 1 . G. (I' 1"). 011 ";w iLslt of te'Liw ~ ia htio.c

a it 1JI tiI iifl nl iii coiitt i enc table J. 'tlti-;t."' S r A

13/4,p 368- L

GART, J. J. (1971). On the ordering of contingency tables for significance tests. Technometrics 13, pp. 910-911.

GART, J. J. (1971). The comparison of proportions: a review of significance tests, confidence intervals, and adjustments for stratification. Rev. Int. Statist. Inst. 39, pp. 148-169.

GOKHALE, D. V. (1971). An iterative procedure for analysing log-linear models. Biometrics 27, pp. 681-687.

GOODMAN, L. A. (1971). Partitioning of chi-square, analysis of marginal contingency tables, and estimation of expected frequencies in multidimensional contingency tables. J. Amer. Statist. Assoc. 66, pp. 339-344.

GOODMAN, L. A. (1971). Some multiplicative models for the analysis of cross-classified data. Proc. Sixth Berkeley Symp., Berkeley and Los Angeles, University of California Press.

GOODMAN, L. A. (1971). The analysis of multidimensional contingency tables: stepwise procedures and direct estimation methods for building models for multiple classifications. Technometrics 13, pp. 33-61.

GRIZZLE, J. E. (1971). Multivariate logit analysis. Biometrics 27, pp. 1057-1062.

JOHNSON, W. D. and KOCH, G. G. (1971). A note on the weighted least squares analysis of the Ries-Smith contingency table data. Technometrics 13, pp. 438-447.

KOCH, G. G., IMREY, P. B., and REINFURT, D. W. (1971). Linear model analysis of categorical data with incomplete response vectors. Institute of Statistics Mimeo Series No. 790, University of North Carolina.


i it ion ;ip V. -oen j! A ,.1,c2 0: if I [I;; t 'it of

I .t pp I ; 7--i

,X, H .H 11)7 1 i~) An tys M:iormi Wi n I i '-:tve ap proa,:h t o

tnc- Lvt-(tj i, 1 W' In: -I..-; :, a c sibs

ill : r~~~ ~i1 ~ o 12 , 7r

KU, H. H., VARNER, R. N., and KULLBACK, S. (1971). On the analysis of multidimensional contingency tables. J. Amer. Statist. Assoc. 66, pp. 55-64.

KULLBACK, S. (1971). Marginal homogeneity of multidimensional contingency tables. Ann. Math. Statist. 42, pp. 594-606.

KULLBACK, S. (1971). The homogeneity of the sex ratio of adjacent sibs in human families. Biometrics 27, pp. 452-457.

AJ . (1971) . Oil tw.o LeS Is for vwrng d~ ion rImp-ortions.

PE~ ,P. * ( 1). iAc 1ionl-c r iI i tv 'r '. ivt ri-k"s f rorn

cli (.1 ;Iit :L 1: i -. ii -- r c 2 ', .

P L R LIZ , L. (1 ;1 .,I ii; li-i l t iU it io I tc-.o :e i ii 1 rojaiL iies

.AYi.I . 1) 1) T 1ii- 0 jnI., 1i- 0 nIC01-1 t LiiOn toc bounided

suw. varicaltes. . CtS-:;; I :>'I'ooi of .:t ~ a~i sciva:es,

the FlI ndex-s khin c*rcr y of Sou th Ma;raY ia, h1 on 191).

wi tr applLcatiin!. t o t i'ie, h i:clL.-RpAr . 32,

1ovamber 26, 19/i., Iepwirtmanrt Of SUtoist ics, S;(,mnord oicnty


I ~.I'.. . J11 t:, i ~ii o :.'ri (a ,; est

I1 t- Ci i ;.t Is t-ie i C I iiI I xcx, colit i nyEncy

ta I:. K~a i;~i. i [az2 Coo 1 :7 k'Len ili vers

7 )1, . i:i; ;-'ieral 2x.' coiAJi'-Lncv tiles.

1970

A, iAM PA, K. I LA M . 11 . (I j /0) . '[ie u z1re'C aIsci at ion of rows

and en 1 .LiTfl~ or ii x:;i cozt i gencyta '.fl Stis OC

Ser. 8 ipp. I7.

BHAPKAR, V. P. (1970). Categorical data analogs of some multivariate tests. Essays in Probability and Statistics (R. C. Bose et al., eds.). University of North Carolina Press, pp. 85-110.

CAMPBELL, L. L. (1970). Equivalence of Gauss's principle and minimum discrimination information estimation of probabilities. Ann. Math. Statist. 41, pp. 1011-1015.

CO v, DU. R'. ( IT 70~) '~'AaI; n i Ii aik ,ry v 0 t C It e I & C o ., i ,L d.

L 0f 11 Q1C)Ln

CA.ADWC(,rl J . M and 1) 1;o', C.. R. (1979 ) . The d i ;r rLul'iol or thco chi-

ustib dioa f i~;ie~CJ ii8e~~ in~ Ls J. tr ;r. Stziti st. AssI9c.

65, 332, pp. 1010-1616.

FTLiPZS. . . (1970A). Anf iterat-ive procedurc for est imation inl conitin-


i , to; ( 9 "1 'Hit i~ ;al i: ,f 11.11 t , ti t I . I ( ont in".erlcy

Fl E0Ip' ;, . i.. !1.ri1 ! :iK.RT, .1. P. 19VA). tw4o i,.' t*.i

Cont irlcIOiicv l..~ LI Stt iu t. IV. 9 -J I.

FIFNBLI\2 , S . E . aritt C(.!.i P. - U. ( 19/U0). i ht f -i I i:iit'.I i ti,- ze ro

Coun t- ill Colit ill c :icy Loib I' s . i\anddl'!!r C>'w. i t

(G. P. Pil! I, e,!. . nte Pm-nsy1.'ani-V Startettnir

GOOD, I. J., GOVER, T. N. and MITCHELL, G. J. (1970). Exact distributions for X-squared and for the likelihood-ratio statistic for the equiprobable multinomial distribution. J. Amer. Statist. Assoc. 65, pp. 267-283.

GOODMAN, L. A. (1970). The multivariate analysis of qualitative data: interactions among multiple classifications. J. Amer. Statist. Assoc. 65, pp. 226-256.

KASTENBAUM, M. A. (1970). A review of contingency tables. Essays in Probability and Statistics (R. C. Bose et al., eds.). The University of North Carolina Press, pp. 407-432.

KULLBACK, S. (1970). Various applications of minimum discrimination information estimation, particularly to problems of contingency table analysis. Proceedings of the Meeting on Information Measures,

t~rn~ir Oft ol hnt rcoOutd , Caniida, A:i )-',KIUO , f-33--t-6tb.

KMcLBACK:, S . (19/0). M itindi!7r1i:natiotl I ia-. stLiition and

apli~tba l.~~ in:;Ur h Sixterit.;cotl tn on WDe-sign

at Lp roii . r'7'.Ioscairch, leve lop.. L, :!nInd ?I g,2 October

I 9/u . A ,)- 1) Ep at 7I~, i Sirc i nre: t-- C 02 c t relice.

VANTEL , N. 1(U1(). Ii pht otIncle'tlr I .'tcs 2

NUL, Y?~I.(11)Jj) . On coL-t Lna piorabLii icz; Lin c onrt igoncy tables

Wi~i Lu itlitLi ant; O i rgi1h Fr.1). d i~;ert t iwl, Thre Gcor:'e

Wain itla( jia irivor:. t y, FL )tierry L9) 0.


OIX) RO 1--F, C . L. (L9 /U) iim Logit chli -!-qt: irtai.tL nsmziL.

i ke Ii I hood es ti Ii. t ion i t ront ti I I z ucyItk' I(-"; . >~ .Jat is V soc

u., 12,pp. '~ --16 1 L

WAGNER, S. S. (1970). The maximum-likelihood estimate for contingency tables with zero diagonal. J. Amer. Statist. Assoc. 65, pp. 1362-1383.

1969

ALTHAM, PATRICIA M. E. (1969). Exact Bayesian analysis of a 2x2 contingency table and Fisher's "exact" significance test. J. Roy. Statist. Soc. Ser. B 31, pp. 261-269.

ARGENTIERO, P. D. (1969). X-squared statistic for goodness of fit test,

its derivation and tables. NASA. Technical Report, TR-R-313.

BISHOP, Y. M. M. (1969). Full contingency tables, logits, and split contingency tables. Biometrics 25, pp. 383-400.

BISHOP, Y. M. M. and FIENBERG, S. E. (1969). Incomplete two-dimensional contingency tables. Biometrics 25, pp. 119-128.

DEMPSTER, A. P. (1969). Some theory related to fitting exponential models. Research Report S-4, Department of Statistics, Harvard University.

FIENBERG, S. E. (1969). Preliminary graphical analysis and quasi-independence for two-way contingency tables. Appl. Statist. 18, pp. 153-168.

GOODMAN, L. A. (1969). On partitioning X-squared and detecting partial association in three-way contingency tables. J. Roy. Statist. Soc. Ser. B 31, pp. 486-498.

GRIZZLE, J. E., STARMER, C. F., and KOCH, G. G. (1969). Analysis of categorical data by linear models. Biometrics 25, pp. 489-504.


jH :f , C . i'. ( 1 . . ac . i; ef. it)I I[1'I

ma r,.,i na 9. LIio !;i t y i an Vi x Oa t i'Igt I, ;~e Ai.w A . Statist.

KOCH, G. G. (1969). The effect of non-sampling errors on measures of association in 2x2 contingency tables. J. Amer. Statist. Assoc. 64, pp. 852-863.

KU, H. H. and KULLBACK, S. (1969). Analysis of multidimensional contingency tables: an information theoretical approach. Contributed papers, 37th Session of the International Statistical Institute, pp. 156-158.

LANCASTER, H. O. (1969). Contingency tables of higher dimensions. Bulletin of the International Statistical Institute 43, 1, pp. 143-151.

LANCASTER, H. O. (1969). The Chi-Squared Distribution. Wiley, New York.

NAGNUR, B. N. (1969). LAMST and the hypotheses of no three factor interaction in contingency tables. J. Amer. Statist. Assoc. 64, pp. 207-215.

l~AQl iP. 1,. (1969). %MW t E=i:WA Ea- ci nan aies. A Survey

of -.a 1eV; -:n,,,'t 0g; , L ~m o ie ic;t:a Lottca

I n-;t Liit w .1 ,p 13 3- 14

BENNETT, B. M. (1968). Note on X-squared tests for matched samples. J. Roy. Statist. Soc. Ser. B 30, pp. 368-370.

BERKSON, J. (1968). Application of minimum logit chi-squared estimate to a problem of Grizzle with a notation on the problem of no interaction. Biometrics 24, pp. 75-96.


h iLIA , V. P. 118 C) a I Iut ;tt .y s o f L i I "c L 11) ILl t

(ILL, ror i t~itjive T ctruon ;e . BJ- L''W ic 24, j) p. 2

BHAPKAR, V. P. and KOCH, G. G. (1968). Hypotheses of "no interaction" in multidimensional contingency tables. Technometrics 10, pp. 107-123.

BHAPKAR, V. P. and KOCH, G. G. (1968). On the hypotheses of "no interaction" in contingency tables. Biometrics 24, pp. 567-594.

FIENBERG, S. E. (1968). The geometry of an rxc contingency table. Ann. Math. Statist. 39, pp. 1186-1190.

GOODMAN, L. A. (1968). The analysis of cross-classified data: independence, quasi-independence, and interactions in contingency tables with or without missing entries. J. Amer. Statist. Assoc. 63, pp. 1091-1131.

HAMDAN, M. A. (1968). Optimum choice of classes for contingency tables. J. Amer. Statist. Assoc. 63, pp. 291-297.

IRELAND, C. T. and KULLBACK, S. (1968). Contingency tables with given marginals. Biometrika 55, pp. 179-188.

IRELAND, C. T. and KULLBACK, S. (1968). Minimum discrimination information estimation. Biometrics 24, pp. 707-713.

KU, H. H., VARNER, R. N., and KULLBACK, S. (1968). Analysis of multidimensional contingency tables. Proc. Conference on the Design of Experiments in Army Research, Development and Testing. ARO-D Report 69-2.

KULLBACK, S. (1968). Information Theory and Statistics. Dover Pub., Inc., New York.


K.. VAL\,,, ( J," , I i !,i iIlly wit i tr; riI i ,l Ann.

.,t~ik . ;;t , . . p . .' ',--i.

, . iK. (i i 'i . iK pid L2.ii c'.1t-.M > t> ,i, ti]it:ies for 2x3

C o c h i. ..... .. .. ,; IPp. l'"1 -- iI .

-AT1HI Lti, E.U-RLW ,nd ,.AM3. .,!, . (]i,)_. iUi te, .. 1 id intito. des

L- iI; Ln ) ib ,I e ao rr,- 1 c l tio1. C. P. Aca' 5c:i Paris 26I,

pp. >32-834.

MOSTELLER, F. (1968). Association and estimation in contingency tables. J. Amer. Statist. Assoc. 63, pp. 1-28.

SLAKTER, M. J. (1968). Accuracy of an approximation to the power of the chi-square goodness of fit test with small but equal expected frequencies. J. Amer. Statist. Assoc. 63, pp. 912-924.

SUGIURA, N. and OTAKE, M. (1968). Numerical comparison of improved methods of testing in contingency tables with small frequencies. Ann. Inst. Statist. Math. 20, pp. 507-517.

1967

S, . (1 t 7). I, tss of hypotht, is cor:-n n, matched samtpIes.

J. ]'o '. :t:tis;t. e .- r. 29, pp. 468- ,!4.

Y X..... T'., . S. (1Q0/). Iu! lLIV L On tc:r, of the multi-

(OK, i. . ... , , I .' t . A ho ic" i p i .al an,- lysis of

m, ..... i .,_ : i i ,: i:. ,n :y 'I:. , . ; ,. , , , i -; 9 , pp. 4 1-48S.

GOOD, I. J. (1967). A Bayesian significance test for multinomial distributions. J. Roy. Statist. Soc. Ser. B 29, pp. 399-431.


1966

ARMITAGE, P. (1966). The chi-square test for heterogeneity of proportions after adjustment for stratification. J. Roy. Statist. Soc. Ser. B 28, pp. 150-163.

BHAPKAR, V. P. (1966). A note on the equivalence of two test criteria for hypotheses in categorical data. J. Amer. Statist. Assoc. 61, pp. 228-235.

BHAPKAR, V. P. (1966). Notes on analysis of categorical data. Institute of Statistics Mimeo Series No. 471, University of North Carolina.

BHAT, B. R. and KULKARNI, S. R. (1966). LAMP test of linear and loglinear hypotheses in multinomial experiments. J. Amer. Statist. Assoc. 61, pp. 236-245.

COX, D. R. (1966). A simple example of a comparison involving quantal data. Biometrika 53, pp. 215-220.

CRADDOCK, J. M. (1966). Testing the significance of a 3x3 contingency table. The Statistician 16, pp. 81-94.

GABRIEL, K. R. (1966). Simultaneous test procedures for multiple comparisons on categorical data. J. Amer. Statist. Assoc. 61, pp. 1081-1096.

GART, J. J. (1966). Alternative analyses of contingency tables. J. Roy. Statist. Soc. Ser. B 28, pp. 164-179.

GOOD, I. J. (1966). How to estimate probabilities. J. Inst. Math. Appl. 2, pp. 364-383.

KULLBACK, S. and KHAIRAT, M. A. (1966). A note on minimum discrimination information. Ann. Math. Statist. 37, pp. 279-280.

MANTEL, N. (1966). Models for complex contingency tables and polychotomous dosage response curves. Biometrics 22, pp. 83-95.


ASANO, C. (1965). On estimating multinomial probabilities by pooling incomplete samples. Ann. Inst. Statist. Math. 17, pp. 1-14.

BHAPKAR, V. P. and KOCH, G. G. (1965). On the hypothesis of "no interaction" in three-dimensional contingency tables. Institute of Statistics Mimeo Series No. 440, University of North Carolina.

BHAPKAR, V. P. and KOCH, G. G. (1965). Hypothesis of no interaction in four-dimensional contingency tables. Institute of Statistics Mimeo Series No. 449, University of North Carolina.

BHAT, B. R. and NAGNUR, B. N. (1965). Locally asymptotically most

stringent tests and Lagrangian multiplier tests of linear hypotheses.

Biometrika 52, 3 and 4, pp. 459-468.

BIRCH, M. W. (1965). The detection of partial association II: the

general case. J. Roy. Statist. Soc. Ser. B 27, pp. 111-124.

CAUSSINUS, H. (1965). Contribution a l'analyse statistique des tableaux

de correlation. Ann. Fac. Sci. Univ. Toulouse 29, pp. 77-182.

GOOD, I. J. (1965). The Estimation of Probabilities: An Essay on Modern Bayesian Methods. Research Monograph, 30. The MIT Press, Cambridge, Massachusetts.

KASTENBAUM, M. A. (1965). Contingency tables: a review. MRC Technical Summary Report No. 596. Mathematics Research Center, The University of Wisconsin.

KATTI, S. K. and SASTRY, A. N. (1965). Biological examples of small expected frequencies and the chi-square test. Biometrics 21, pp. 49-54.

LANCASTER, H. O. and BROWN, T. A. I. (1965). Size of X-squared test in the symmetrical multinomials. Austral. J. Statist. 7, p. 40.


LEWONTIN, R. C. and FELSENSTEIN, J. (1965). The robustness of homogeneity tests in 2xn tables. Biometrics 21, pp. 19-33.

MOTE, V. L. and ANDERSON, R. L. (1965). An investigation of the effect of misclassification on the properties of chi-squared tests in the analysis of categorical data. Biometrika 52, pp. 95-109.

RADHAKRISHNA, S. (1965). Combination of results from several 2x2 contingency tables. Biometrics 21, pp. 86-98.

1964

ALLISON, H. E. (1964). Computational forms for chi-square. Amer. Statist.

18, 1, pp. 17-18.

BENNETT, B. M. and NAKAMURA, E. (1964). Tables for testing significance in a 2x3 contingency table. Technometrics 6, 4, pp. 439-458.

BROSS, I. D. J. (1964). Taking a covariable into account. J. Amer. Statist. Assoc. 59, 307, pp. 725-736.

CHEW, V. (1964). Application of the negative binomial distribution with probability of misclassification. Virginia Journal of Science 15, 1, pp. 34-46.

GOODMAN, L. A. (1964). Simultaneous confidence limits for cross-product ratios in contingency tables. J. Roy. Statist. Soc. Ser. B 26, 1, pp. 86-102.

GOODMAN, L. A. (1964). Simple methods for analyzing three-factor interaction in contingency tables. J. Amer. Statist. Assoc. 59, pp. 319-352.

GOODMAN, L. A. (1964). Interactions in multidimensional contingency tables. Ann. Math. Statist. 35, 2, pp. 632-646.

GOODMAN, L. A. (1964). Simultaneous confidence intervals for contrasts among multinomial populations. Ann. Math. Statist. 35, 2, pp. 716-725.


I tI.. F '' I 19 )t - Goi 1) .r I i I t I I p ,r' I'inc ions

for t.,t t ;t ,I , i! . coot i i .i t It ,. Ann. Math.

•d 1 , Hi ' • '

LINDLEY, D. V. (1964). The Bayesian analysis of contingency tables. Ann. Math. Statist. 35, pp. 1622-1643.

PLACKETT, R. L. (1964). The continuity correction in 2x2 tables. Biometrika 51, Parts 3 and 4, pp. 327-338.

PUTTER, J. (1964). The X-squared goodness-of-fit test for a class of cases of dependent observations. Biometrika 51, pp. 250-252.

SOMERS, R. H. (1964). Simple measures of association for the triple dichotomy. J. Roy. Statist. Soc. Ser. A 127, 3, pp. 409-415.

TALLIS, G. M. (1964). The use of models in the analysis of some classes of contingency tables. Biometrics 20, 4, pp. 832-839.

BENNETT, B. M. and NAKAMURA, E. (1963). Tables for testing significance in a 2x3 contingency table. Technometrics 5, 4, pp. 501-511.

BIRCH, M. W. (1963). Maximum likelihood in three-way contingency tables. J. Roy. Statist. Soc. Ser. B 25, 1, pp. 220-233.

DARROCH, J. N. and SILVEY, S. D. (1963). On testing more than one hypothesis. Ann. Math. Statist. 34, 2, pp. 555-567.

DIAMOND, E. L. (1963). The limiting power of categorical data chi-square tests analogous to normal analysis of variance. Ann. Math. Statist.

34, pp~f. 1 7 -] "

EDWARDS, A. W. F. (1963). The measure of association in a 2x2 table. J. Roy. Statist. Soc. Ser. A 126, 1, pp. 109-114.

FELDMAN, S. E. and KLINGER, E. (1963). Short cut calculation of the Fisher-Yates "exact test". Psychometrika 28, pp. 289-291.


GOOD,!, 1- A. (PLi)63 . Tf- s i 1 t Ltvp. t I;. u It 1: -1 AI9 I: Il

fuor mul L id( i::Lns iun~i cot in tn; v Li 'I") I "s t .2 tI .t . 34 ,

3, Ipp. 9 11- 9 3'

GOODMAN, L. A. (1963). On methods for comparing contingency tables. J. Roy. Statist. Soc. Ser. A 126, 1, pp. 94-108.

GOODMAN, L. A. (1963). On Plackett's test for contingency table interactions. J. Roy. Statist. Soc. Ser. B 25, 1, pp. 179-188.

GOODMAN, L. A. and KRUSKAL, W. H. (1963). Measures of association for cross classification III: approximate sampling theory. J. Amer. Statist. Assoc. 58, pp. 310-364.

KU, H. H. (1963). A note on contingency tables involving zero frequencies and the 2I test. Technometrics 5, 3, pp. 398-400.

MANTEL, N. (1963). Chi-square tests with one degree of freedom: extensions of the Mantel-Haenszel procedure. J. Amer. Statist. Assoc. 58, pp. 690-700.

NEWELL, D. J. (1963). Misclassification in 2x2 tables. Biometrics 19, 1, pp. 187-188.

OKAMOTO, M. (1963). Chi-square statistic based on the pooled frequencies of several observations. Biometrika 50, pp. 524-528.

RIES, P. N. and SMITH, H. (1963). The use of chi-square for preference testing in multidimensional problems. Chem. Eng. Progress Symposium Series 59, 42, pp. 39-43.

WALSH, J. E. (1963). Loss in test efficiency due to misclassification for 2x2 tables. Biometrics 19, 1, pp. 158-164.


DALY, C. (1962). A simple test for trends in a contingency table. Biometrics 18, 1, pp. 114-119.

DARROCH, J. N. (1962). Interactions in multi-factor contingency tables. J. Roy. Statist. Soc. Ser. B 24, 1, pp. 251-263.

FISHER, SIR RONALD A. (1962). Confidence limits for a cross-product ratio. Austral. J. Statist. 4, 1, p. 41.

GART, J. J. (1962). Approximate confidence limits for relative risks. J. Roy. Statist. Soc. Ser. B 24, 2, pp. 454-463.

GART, J. J. (1962). On the combination of relative risks. Biometrics

18, 4, pp. 601-610.

KINCAID, W. M. (1962). The combination of 2xm contingency tables. Biometrics 18, 2, pp. 224-228.

KULLBACK, S., KUPPERMAN, M. and KU, H. H. (1962). An application of information theory to the analysis of contingency tables with a table of 2n ln n, n = 1(1)10,000. J. Res. Nat. Bur. Standards Sect. B 66, pp. 217-243.

KULLBACK, S., KUPPERMAN, M., and KU, H. H. (1962). Tests for contingency tables and Markov chains. Technometrics 4, 4, pp. 573-608.

LEWIS, B. N. (1962). On the analysis of interaction in multi-dimensional contingency tables. J. Roy. Statist. Soc. Ser. A 125, 1, pp. 88-117.

PLACKETT, R. L. (1962). A note on interactions in contingency tables. J. Roy. Statist. Soc. Ser. B 24, 1, pp. 162-166.

TALLIS, G. M. (1962). The maximum likelihood estimation of correlation from contingency tables. Biometrics 18, 3, pp. 342-353.


1961

BERGER, A. (1961). On comparing intensities of association between two binary characteristics in two different populations. J. Amer. Statist. Assoc. 56, pp. 889-908.

BHAPKAR, V. P. (1961). Some tests for categorical data. Ann. Math. Statist. 32, 1, pp. 72-83.

BILLINGSLEY, P. (1961). Statistical Inference for Markov Processes. Statistical Research Monographs, University of Chicago Press.

CLARINGBOLD, P. J. (1961). The use of orthogonal polynomials in the partition of chi-square. Austral. J. Statist. 3, 2, pp. 48-63.

FRIEDLANDER, D. (1961). A technique for estimating a contingency table, given the marginal totals and some supplementary data. J. Roy. Statist. Soc. Ser. A 124, 3, pp. 412-420.

GREGORY, G. (1961). Contingency tables with a dependent classification. Austral. J. Statist. 3, 2, pp. 42-47.

GRIZZLE, J. E. (1961). A new method of testing hypotheses and estimating parameters for the logistic model. Biometrics 17, 3, pp. 372-385.

KENDALL, M. G. and STUART, A. (1961). The Advanced Theory of Statistics. Charles Griffin and Company, London.

OKAMOTO, M. and ISHII, G. (1961). Test of independence in intraclass 2x2 tables. Biometrika 48, pp. 181-190.

ROGOT, E. (1961). A note on measurement errors and detecting real differences. J. Amer. Statist. Assoc. 56, pp. 314-319.

SCHULL, W. J. (1961). Some problems of analysis of multi-factor tables. Bull. Int. Statist. Inst. 28, Part 3, pp. 239-270.

YATES, F. (1961). Marginal percentages in multiway tables of quantal data with disproportionate frequencies. Biometrics 17, 1, pp. 1-9.


1960

BENNETT, B. M. and HSU, P. (1960). On the power function of the exact test for the 2x2 contingency table. Biometrika 47, pp. 393-398.

GRIDGEMAN, N. T. (1960). Card-matching experiments: a conspectus of theory. J. Roy. Statist. Soc. Ser. A 123, 1, pp. 45-49.

ISHII, G. (1960). Intraclass contingency tables. Ann. Inst. Statist. Math. 12, pp. 161-207; corrections, p. 279.

KASTENBAUM, M. A. (1960). A note on the additive partitioning of chi-square in contingency tables. Biometrics 16, 3, pp. 416-422.

KUPPERMAN, M. (1960). On comparing two observed frequency counts.

Appl. Statist. 9, 1, pp. 37-42.

LANCASTER, H. O. (1960). On tests of independence in several dimensions. J. Austral. Math. Soc. 1, pp. 241-254.

ROBERTSON, W. H. (1960). Programming Fisher's exact method of comparing two percentages. Technometrics 2, 1, pp. 103-107.

1959

ANDERSON, R. L. (1959). Use of contingency tables in the analysis of consumer preference studies. Biometrics 15, 4, pp. 582-590.

CHAKRAVARTI, I. M. and RAO, C. R. (1959). Tables for some small sample tests of significance for Poisson distributions and 2x3 contingency tables. Sankhya 21, Parts 3 and 4, pp. 315-326.

GOODMAN, L. A. and KRUSKAL, W. H. (1959). Measures of association for cross classification II: further discussion and references. J. Amer. Statist. Assoc. 54, pp. 123-163.

HALDANE, J. B. S. (1959). The analysis of heterogeneity, I. Sankhya 21, Parts 3 and 4, pp. 209-216.


HOYT, C. J., KRISHNAIAH, P. R., and TORRANCE, E. P. (1959). Analysis of complex contingency data. J. Experimental Education 27,

KASTENBAUM, M. A. and LAMPHIEAR, D. E. (1959). Calculation of chi-square to test the no three-factor interaction hypothesis. Biometrics 15, 1, pp. 107-115.

KULLBACK, S. (1959). Information Theory and Statistics. John Wiley and Sons, New York.

KUPPERMAN, M. (1959). A rapid significance test for contingency tables. Biometrics 15, 4, pp. 625-628.

NASS, C. A. G. (1959). The X-squared test for small expectations in contingency tables, with special reference to accidents and absenteeism. Biometrika 46, pp. 365-385.

SILVEY, S. D. (1959). The Lagrangian multiplier test. Ann. Math. Statist. 30, 2, pp. 389-407.

SOMERS, R. H. (1959). The rank analogue of product-moment partial correlation and regression, with application to manifold, ordered contingency tables. Biometrika 46, pp. 241-246.

STEYN, H. S. (1959). On X-squared tests for contingency tables of negative binomial type. Statistica Neerlandica 13, pp. 433-444.

WEINER, I. B. (1959). A note on the use of Mood's likelihood ratio test for item analyses involving 2x2 tables with small samples. Psychometrika 24, 4, pp. 371-372.

1958

BLALOCK, H. M., Jr. (1958). Probabilistic interpretations for the mean square contingency. J. Amer. Statist. Assoc. 53, pp. 102-105.


KASTENBAUM, M. A. (1958). Estimation of relative frequencies of four sperm types in Drosophila melanogaster. Biometrics 14, 2, pp. 223-228.

MITRA, S. K. (1958). On the limiting power function of the frequency chi-square test. Ann. Math. Statist. 29, pp. 1221-1233.

SNEDECOR, G. W. (1958). Chi-squares of Bartlett, Mood and Lancaster in a 2^3 contingency table. Biometrics 14, 4, pp. 560-562. (Query).

1957

BROSS, I. D. J. and KASTEN, E. L. (1957). Rapid analysis of 2x2 tables. J. Amer. Statist. Assoc. 52, pp. 18-28.

CORSTEN, L. C. A. (1957). Partition of experimental vectors connected with multinomial distributions. Biometrics 13, 4, pp. 451-484.

EDWARDS, J. H. (1957). A note on the practical interpretation of 2x2 tables. Brit. J. Prev. Soc. Med. 11, pp. 73-78.

LANCASTER, H. O. (1957). Some properties of the bivariate normal distribution considered in the form of a contingency table. Biometrika 44, pp. 289-292.

MOTE, V. L. (1957). An investigation of the effect of misclassification of the chi-square tests in the analysis of categorical data. Unpublished Ph.D. dissertation, North Carolina State College, Raleigh, North Carolina (also Institute of Statistics Mimeo Series No. 182).

ROY, S. N. (1957). Some Aspects of Multivariate Analysis. John Wiley and Sons, New York.

SAKODA, J. M. and COHEN, B. H. (1957). Exact probabilities for contingency tables using binomial coefficients. Psychometrika 22, 1, pp. 83-86.

WOOLF, B. (1957). The log likelihood ratio test (the G-test). Methods and tables for tests of heterogeneity in contingency tables. Annals


1,~::~ J . A. (19')6) , A n( oi Jnkill:; :'A)o ue 1- ''( o-1 li (r 1 tr. ICitor ic

r * j :at i k 1 20i, 3 p P VI

W D, . J. (I)ih . I th :;I t 7,::1t i .o j 11 !; li- 1 ro l, ac i e; 0 ;I t ir oncy

GR~iWL\A N. 1'.(.19h). "loc.u Suprmnt 1 ~ l 1to. ,2

pp. 106-112.

IENIR . K. anid Fi'NNLY, D. . (19)-6) A- ext e w; ron of tb u., o)f t iiK2 tt.A;)1 . ta t . , 2, pp. 132-130.

MAINLAND, D., HERRERA, L. and SUTCLIFFE, M. I. (1956). Statistical tables for use with binomial samples - contingency tests, confidence limits, and sample size estimates. New York University College of Medicine, New York.

ROY, S. N. and KASTENBAUM, M. A. (1956). On the hypothesis of no "interaction" in a multiway contingency table. Ann. Math. Statist. 27, 3, pp. 749-757.

ROY, S. N. and MITRA, S. K. (1956). An introduction to some non-parametric generalizations of analysis of variance and multivariate analysis. Biometrika 43, Parts 3 and 4, pp. 361-376.

WATSON, G. S. (1956). Missing and "mixed-up" frequencies in contingency tables. Biometrics 12, pp. 47-50.

ARMITAGE, P. (1955). Tests for linear trends in proportions and frequencies. Biometrics 11, pp. 375-386.

ARMSEN, P. (1955). Tables for significance tests of 2x2 contingency tables. Biometrika 42, pp. 494-505.

COCHRAN, W. G. (1955). A test of a linear function of the deviations between observed and expected numbers. J. Amer. Statist. Assoc. 50, pp. 377-397.


j ~ A-I. prb,~ in -- (E f ;,

I iALIDA 1, . .. '. Iii' rip i d< -1 uiat i i I a, a L~c;t of homo-

JENI ,U.L .(E,).Ail itiinveii V)t iLOxlotr rv cI.:iaric r. -P ychoretrika

20, , pp.

KASTENBAUM, M. A. (1955). Analysis of data in multiway contingency tables. Unpublished doctoral dissertation, North Carolina State College, October 1955.

LESLIE, P. H. (1955). A simple method of calculating the exact probability in 2x2 contingency tables with small marginal totals. Biometrika 42,

MITRA, S. K. (1955). Contributions to the statistical analysis of categorical data. North Carolina Institute of Statistics Mimeograph Series No. 142, December 1955.

RU1l, , : . N. ind KAi ":-! , M. \. (J ) . A ;vnr r.i! i, itian of analysis of

var .~ a A ~ t:i :1 ri ~ I:,t a y,,i.; t.o a t a B. ,i~ tu I n Ireq1 uel~ i s in

qc~l~tativ%I: ''L s ihterVIa. NrnCrin lnstit-utc

of St at is~ Vt n .: ni,ipi. 131, J11, PI'

ROY, .1. "':. and : I *:A , . .!- A: iiiirot ctji to soi-e flan-parajmltric

avw a! aon ;' wl wA-q1 A I ari inco Wfl Wit LVWrit v 10. ysis.

i'u11tl ( i I I ll I i t it t s , t ti!; t ::i i-L:.& i i i i Arriet So . 139,

S 6KA iC, C. C . , A',', II VAl.. .,,S 1, (: ILkI o U SIY , P. N. 19 5o 0. it te

tU7111 f I it), .1 (,1 t. 1 1 if i c icelr Iw- tiw d! ft'f itQ betvzeeli

twot prop )rt i w:. 1!11 I'.a j ,, I p


STUART, A. (1955). A test for homogeneity of the marginal distributions in a two-way classification.

I. (I i. .') O l t Cii-: A ,: ruhi' f

YATES, F. (1955). A note on the application of the combination of probabilities test to a set of 2x2 tables. Biometrika 42, pp. 404-411.

YATES, F. (1955). The use of transformations and maximum likelihood in the analysis of quantal experiments involving two treatments. Biometrika 42, pp. 382-403.

BROSS, I. D. J. (1954). Misclassification in 2x2 tables. Biometrics 10, 4, pp. 478-486.

COCHRAN, W. G. (1954). Some methods for strengthening the common chi-square tests. Biometrics 10, 4, pp. 417-451.

i)AI.'SON ...... (1054) .A SimpIt i. ,.r;h'. r the vairianrr (if tho

GOODMAN, L. A. and KRUSKAL, W. H. (1954). Measures of association for cross classifications. J. Amer. Statist. Assn. 49, pp. 732-764.

KIMBALL, A. W. (1954). Short-cut formulas for the exact partition of chi-square in contingency tables. Biometrics 10, pp. 452-458.

nt i.~: H, 2, pit. 91-lbc.

COCHRAN, W. G. (1952). The χ² test of goodness of fit. Ann. Math. Statist. 23, 3, pp. 315-345.


LANCASTER, H. O. (1951). Complex contingency tables treated by the partition of χ². J. Roy. Statist. Soc. Ser. B 13, pp. 242-249.

S 1 -TI5Ai0 (ill). Ti itrr intir .~iilr in ci i-in.ecncy

19 50

TOCHER, K. D. (1950). Extension of the Neyman-Pearson theory of tests to discontinuous variates. Biometrika 37, pp. 130-144.

HSU, P. L. (1949). The limiting distribution of functions of sample means

Sy,1 i 'it I '2,It j .;'f'nr- ; ,Jj~ Pr ,~ j *,> (110i

!wJ . (- (1 )) . '' ,o te o n tL he s Aub dI i S'IUi I b M - s qIl r I aaA

L,...i 1> 'i . (. ''i) e r 1vat- i onl and par t i "i ni Mtl -' l.j iT

(' i ;r s~ LI ibrit i 11115. kit utri 4" y. OViID.

-1906

ATS I. (19'..) 1.. .nlyi; of cont 11 .-IVV '1 :0 w i Iii roip fn,s based


C:' ttAr[it , 't i K,, 1! 1], . ,1L + 12+ , [t I : S!'<Z r 1[iI

'I c' ' 11 i v c: r++ ]'

I ii-! .,

sit y P'r>, p.:'

1945

NORTON, H. W. (1945). Calculation of chi-square for complex contingency tables. J. Amer. Statist. Assn. 40, pp. 251-258.

1937

HALDANE, J. B. S. (1937). The exact value of the moments of the distribution of χ², used as a test of goodness of fit, when expectations are small. Biometrika 29, pp. 133-143.

1935

BARTLETT, M. S. (1935). Contingency table interactions. J. Roy. Statist. Soc. Suppl. 2, pp. 248-252.

I:t ,1 . AriA. r"art. t i t , yp . 190-- 1 u

1934

tLV', 2',. A. (1i i4) . 01 , I - andi Io y 2 ;' d ' ., -; ,, V i:i rli

. i l t .rti A, j[i,: t. A t' . I A :. (II V' r ,inJ tIOcy.! IA . , 3<cc 1 icci 2 t.22i.

2Y A. 'F' , f. . ( I') .',) .:I(t : y' II t 1 i ! v : I v i 1 , + :. I I tu ' U .-, a .i , tJ ' , 2

t, t y I ;< . P: i 'L._ K.nc_. - ,~ ,i '.,; ! ,. 2 . IS,.


FISHER, R. A. (1924). The conditions under which chi-square measures the discrepancy between observation and hypothesis. J. Roy. Statist. Soc. 87, pp. 442-450.

1922

FISHER, R. A. (1922). On the interpretation of chi-square from contingency tables, and the calculation of P. J. Roy. Statist. Soc. 85, pp. 87-94.

1900

PEARSON, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Phil. Mag. Series 5, 50, pp. 157-175.


REPORT DOCUMENTATION PAGE

4. TITLE (and Subtitle): The Information in Contingency Tables - An Application of Information-Theoretic Concepts to the Analysis of Contingency Tables

7. AUTHOR(s): C. T. Ireland and S. Kullback

9. PERFORMING ORGANIZATION NAME AND ADDRESS: Department of Statistics, Stanford University, Stanford, California

11. CONTROLLING OFFICE NAME AND ADDRESS: Office of Naval Research, Statistics & Probability Program, Arlington, Virginia

15. SECURITY CLASS. (of this report): Unclassified

16. DISTRIBUTION STATEMENT (of this Report): Approved for public release; distribution unlimited.

19. KEY WORDS: contingency tables

20. ABSTRACT


The analysis of the information in contingency tables is an aspect of multivariate (multiple variates) analysis with particular application to qualitative or categorical as well as quantitative variables.

The analysis is concerned with counts in multiway cross-classifications or multiway contingency tables. Multiway contingency tables, or cross-classifications of vectors of discrete random variables, provide a useful approach to the analysis of multivariate discrete data.

The method of analysis presented will bring out the various interrelationships among the classificatory variables in a multiway cross-classification or contingency table in many dimensions.

The procedure is based on the Principle of Minimum Discrimination Information Estimation, associated statistics and Analysis of Information. General computer programs are available to provide the necessary results for inference. An analysis of a four-way contingency table is presented for illustration of these techniques.

#235
