Introduction to Mathematica for Statisticians · faculty.washington.edu/kenrice/all.pdf

Introduction to Mathematica, for Statisticians

Contents:

Ken’s introductory slides (6 pages)
Accompanying notebook file (3 pages)

Fotios’s notebook file, with more detailed examples (3 pages)

Becky’s notes on her example (3 pages) and accompanying graph (1 page)

Ken’s example (5 pages)

…and some bonus material!

Richard’s slides (4 pages) and notebook file (2 pages) for his unseen example

Mathematica

information sharing session

11th November 2003

Ken + Becky, Fotios, Richard

‘How to’ Mathematica

• What does Mathematica do?
  – symbolic maths manipulation (algebra)
  – highly accurate numerical work
  – fancy graphics

• What doesn’t it do?
  – statistical ‘tools’; residuals, R², anova
  – nice data manipulation
  – nice data input/output


Syntax

• Use <shift+return> to evaluate expressions

• Generally as close as possible to ‘real’ maths

• Objects/functions work similarly to R/S

• Functions have sensible names

• Everything stored in ‘notebook’ files (.nb); input, output, annotation, pictures


Syntactical details

• Functions start with capital letters; use [ ] for function arguments: Abs[-22], N[Sqrt[2]]

• Use = for assignment: y = 10

• NewFunction[x_] := defines a new function: MyFunc[x_] := x^2

• ‘Evaluate at’; expression /. x->x0: MyFunc[10], MyFunc[y] /. y->10

• Use { } for lists/vectors/matrices: Plot[y,{y,0,10}], ylist={0,2,25}

• == represents equality (like R/S): Solve[x^2==4,x]

• Use ( ) intuitively, stops confusion: z^2 /. z->10+5


A real example

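• CHI problem: hospitals’ death rates $Y_i$ follow

$$Y_i \sim N(\mu_i, \sigma_i^2)$$

$$\mu_i \sim (1-\varepsilon)\,N(\mu_0, \sigma_0^2) + \varepsilon F$$

• Three unknown parameters, plus have to make a good guess for $F$

[Diagram: graphical model with nodes $Y_i$, $\mu_i$, $\sigma_i$, $Z_i$, $F$, $\mu_0$, $\sigma_0$ and $\varepsilon$, showing $\mu_i$ for $Z = 0$ and for $Z = 1$]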


Other useful things

• Good tutorial + help system (master index)

• Add-ons for extra functionality - MathStatica

• Batch mode for e.g. simulations

• Abort: <Alt + .>

• Numerical iteration methods not perfect, just like other packages

• Not foolproof (integration)

• Limited number of licenses!


Examples of Mathematica syntax

Use <Shift+Return> to evaluate expressions;

1 + 1
2

2^50
1125899906842624

2^-50
1/1125899906842624

Functions start with capital letters, and generally have sensible names;

Abs[-22]
22

Many functions take optional arguments with default settings, which can usually be left out, as in R

N[Sqrt[2]]
1.41421

N[Sqrt[2], 40]
1.414213562373095048801688724209698078570

Assign values using =. A semi-colon afterwards stops output

y = 10
10

y = 10;

y^2
100

N[Sqrt[y]]
3.16228

Defining new functions, and evaluating expressions at particular values

MyFunc[x_] := x^2

MyFunc[10]
100

MyFunc[y]
100

z^2 /. z -> 10
100

MyFunc[z] /. z -> 10
100

z
z

Curly brackets are used for lists − includes vectors, matrices, also ranges

Plot[MyFunc[z], {z, -5, 10}]
[Plot: MyFunc[z] against z from -5 to 10]
-Graphics-

avec = {a1, a2};
amatrix = {{1, 2}, {2, 1}};
MatrixForm[amatrix]
( 1  2 )
( 2  1 )

amatrix.avec
{a1 + 2 a2, 2 a1 + a2}

Use == for equality;

Solve[x^2 == 4, x]
{{x -> -2}, {x -> 2}}

bmatrix = Table[If[i == j, 2, 1], {i, 1, 2}, {j, 1, 2}]
MatrixForm[bmatrix]
General::spell1 : Possible spelling error: new symbol name "bmatrix" is similar to existing symbol "amatrix".
{{2, 1}, {1, 2}}
( 2  1 )
( 1  2 )

Regular brackets ( ) act in the normal way

Expand[(x + 2)^10, x]
1024 + 5120 x + 11520 x^2 + 15360 x^3 + 13440 x^4 + 8064 x^5 + 3360 x^6 + 960 x^7 + 180 x^8 + 20 x^9 + x^10

z^2 /. z -> 10 + 5
225

(z^2 /. z -> 10) + 5
105

Other commands of general interest

Simplify[1024 + 5120 x + 11520 x^2 + 15360 x^3 + 13440 x^4 + 8064 x^5 + 3360 x^6 + 960 x^7 + 180 x^8 + 20 x^9 + x^10]
(2 + x)^10

TeXForm[MatrixForm[bmatrix]]
\matrix{ 2 & 1 \cr 1 & 2 \cr }

Simplify[Log[8]/Log[2]]
Log[8]/Log[2]

FullSimplify[Log[8]/Log[2]]
3

Plot3D[Cos[x] Sin[y], {x, 0, 2 Pi}, {y, 0, 2 Pi}];
[Surface plot: Cos[x] Sin[y] for x and y from 0 to 2 Pi]

FindMinimum[1 + x^2, {x, -3}]
{1., {x -> 0.}}

<<NumericalMath`NMinimize`
NMinimize[1+x^2,x]
{1., {x -> 0.}}

NMinimize[{1+x^2,x>2},x]
NMinimize::strong : Strong inequality has been changed to a weak inequality.
{5., {x -> 2.}}
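In later Mathematica releases NMinimize is available without loading the NumericalMath package. The sketch below assumes such a version (the session above used the package load) and writes the constraint as a weak inequality, which avoids the NMinimize::strong message.

```mathematica
(* Assumes a Mathematica version where NMinimize is built in *)
(* Constrained minimum of 1 + x^2 on x >= 2: attained on the boundary x = 2 *)
NMinimize[{1 + x^2, x >= 2}, x]
```

The result has the same shape as above: the minimum value first, then a list of rules giving the minimising argument.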

Vector + Matrices

w = {1, 2, 3}
{1, 2, 3}

q = {{1, 2}, {3, 4}}
{{1, 2}, {3, 4}}

Import data from .txt file

chdmm = Import["CHDMM.txt", "Table"]
{{"AGEGRP", "GRLEV", "GRLUMP", "SEX", "S", "T", "I1", "I2", "I3", "I4", "I5"},
 {4, 1, 1, 1, 3520, 4254, 1, 0, 0, 0, 0}, {4, 4, 2, 1, 0, 701, 1, 0, 0, 0, 0},
 {4, 2, 1, 1, 2193, 2763, 1, 0, 0, 0, 0}, {4, 6, 3, 1, 0, 988, 1, 0, 0, 0, 0},
 {4, 4, 2, 1, 2146, 3286, 1, 0, 0, 0, 0}, {3, 6, 3, 2, 3346, 3413, 1, 0, 0, 0, 0},
 {4, 5, 2, 1, 2486, 2584, 1, 0, 0, 0, 0}, {3, 6, 3, 2, 3313, 3652, 1, 0, 0, 0, 0},
 {4, 6, 3, 1, 3472, 3484, 1, 0, 0, 0, 0}, {3, 1, 1, 1, 2530, 2782, 1, 0, 0, 0, 0}, ...

Access data

w[[2]]
2

q[[2, 2]]
4

Part[chdmm, 1, 1]
AGEGRP

Extract[chdmm, 1]
{AGEGRP, GRLEV, GRLUMP, SEX, S, T, I1, I2, I3, I4, I5}

Extract[chdmm, 100]
{4, 1, 1, 1, 825, 5026, 0, 1, 0, 0, 0}

chdmm[[1, 1]]
AGEGRP
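Whole columns can be pulled out with the same double-bracket notation. A minimal sketch on a small made-up table (toy is a hypothetical stand-in with a header row, not the CHDMM data itself):

```mathematica
(* toy stand-in for an imported table: a header row followed by data rows *)
toy = {{"AGEGRP", "SEX", "S"}, {4, 1, 3520}, {3, 2, 3346}, {4, 1, 2486}};

rows = Rest[toy];        (* drop the header row *)
scol = rows[[All, 3]]    (* third entry of every data row: {3520, 3346, 2486} *)
Mean[scol]               (* exact mean of that column *)
```

Dropping the header with Rest first keeps the string labels out of any numerical summaries.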

Differentiation

D[x^2 + 5 x, x]
5 + 2 x

(#^2 + 5 # &)'[x]
5 + 2 x

f[x_] := x^2 + 5 x

f'[x]
5 + 2 x

Derivative[1][f][x]
5 + 2 x

f''[x]
2

Derivative[2][f][x]
2

g[x_, y_, z_] := x^3 + y^2 + z^2

g'[x]
g'[x]

Derivative[1, 0, 0][g][x, y, z]
3 x^2
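Here g'[x] is shorthand for Derivative[1][g][x], which treats g as a function of a single argument; since g was defined with three arguments, it comes back unevaluated. A sketch of an alternative for partial derivatives, applying D to the expression g[x, y, z] with the same g as above:

```mathematica
g[x_, y_, z_] := x^3 + y^2 + z^2

D[g[x, y, z], x]        (* partial derivative with respect to x: 3 x^2 *)
D[g[x, y, z], y]        (* partial derivative with respect to y: 2 y *)
D[g[x, y, z], {x, 2}]   (* second partial derivative in x: 6 x *)
```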

Integration

∫_0^100 (5 t^4 + 2 t) dt
10000010000

Integrate[5 t^4 + 2 t, {t, 0, 100}]
10000010000

∫ (5 t^4 + 2 t) dt
t^2 + t^5

Integrate[5 t^4 + 2 t, t]
t^2 + t^5

Double Integral

∫_0^10 ∫_0^10 (x^3 + y^3) dx dy
50000

Integrate[x^3 + y^3, x, y]
x^4 y/4 + x y^4/4
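The definite double integral can also be typed in linear form, with one range per variable. A short sketch, with a numerical version alongside for comparison:

```mathematica
(* definite double integral over the square [0, 10] x [0, 10]; gives 50000 *)
Integrate[x^3 + y^3, {x, 0, 10}, {y, 0, 10}]

(* numerical version of the same integral *)
NIntegrate[x^3 + y^3, {x, 0, 10}, {y, 0, 10}]
```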

Plots

Plot[x^2, {x, 0, 100}]
[Plot: x^2 against x from 0 to 100]
-Graphics-

Plot3D[x^2 + y^2, {x, 0, 10}, {y, 0, 10}]
[Surface plot: x^2 + y^2 for x and y from 0 to 10]
-SurfaceGraphics-

Background

♦ Cluster randomised trials randomise groups of individuals together.

♦ To allow for correlation within clusters, sample size must be inflated to retain adequate power.

♦ Power of a cluster randomised trial depends on the intracluster correlation coefficient (ICC).

Power of a planned cluster randomised trial

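Traditional to calculate power using an available ICC estimate $\hat\rho$:

$$\mathrm{Power}(\hat\rho) = \Phi\left\{ \delta\sqrt{\frac{nk}{4\,[1+(n-1)\hat\rho]}} - \Phi^{-1}\!\left(1-\frac{\alpha}{2}\right) \right\}$$

where n = planned cluster size

k = planned number of clusters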
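The traditional calculation can be sketched as a small function. The name power is hypothetical, and the sketch assumes a Mathematica version in which NormalDistribution, CDF and InverseCDF are available without loading a statistics package. It takes the formula to be the standard normal CDF evaluated at delta Sqrt[n k/(4 (1 + (n - 1) rho))] minus the upper alpha/2 normal quantile.

```mathematica
(* power is a hypothetical helper name, not from the original notes *)
power[rho_, n_, k_, delta_, alpha_] :=
  CDF[NormalDistribution[0, 1],
    delta Sqrt[n k/(4 (1 + (n - 1) rho))] -
      InverseCDF[NormalDistribution[0, 1], 1 - alpha/2]]

(* cluster size 20, 15 clusters, difference of 0.4 SD, two-sided 5% level *)
power[0.025, 20, 15, 0.4, 0.05]
```

With these inputs the value should come out at roughly 0.81.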

We want to calculate power while allowing for the imprecision in $\hat\rho$:

$$\mathrm{Power}(\rho) = \Phi\left\{ \delta\sqrt{\frac{nk}{4\,[1+(n-1)\rho]}} - \Phi^{-1}\!\left(1-\frac{\alpha}{2}\right) \right\}$$

♦ Using information about the data from which $\hat\rho$ was obtained, we construct a distribution for the true ICC $\rho$

♦ This gives a distribution for $\mathrm{Power}(\rho)$

Examining the distribution for power

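Useful to look at mean power, which is proportional to:

$$\int_0^1 \Phi\left\{ \delta\sqrt{\frac{nk}{4\,[1+(n-1)y]}} - \Phi^{-1}\!\left(1-\frac{\alpha}{2}\right) \right\} \frac{1}{\sqrt{2\pi\hat\sigma^2}} \exp\left\{ -\frac{1}{2\hat\sigma^2}\,(y-\hat\rho)^2 \right\} dy$$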

Defining the double integral, in stages:

xintegrand = Exp[-x^2/2]/Sqrt[2 Pi]

yintegrand = Exp[-(y - rhohat)^2/(2 sigmasq)]/Sqrt[2 Pi sigmasq]

phi[x_] := Sqrt[1/(2 Pi)] Integrate[Exp[-(y^2)/2], {y, -Infinity, x}]

constant = 1/(phi[(1 - rhohat)/Sqrt[sigmasq]] - phi[-rhohat/Sqrt[sigmasq]])

Combining to obtain expression for mean power:

meanpower = constant Integrate[xintegrand yintegrand, {y, 0, 1},
  {x, -Infinity, ((delta Sqrt[(n k)/4])/Sqrt[1 + (n - 1) y]) - phisig}]

The integral can’t be evaluated analytically.

But Mathematica will give numerical values for mean power if we provide values for the constants $\hat\rho$, $\hat\sigma^2$, $\delta$, $n$ and $k$.
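One way to do the numerical step directly is to replace the inner integral over x by the normal CDF and hand the remaining integral over y to NIntegrate. This is a sketch under stated assumptions rather than the code used in these notes: meanPowerNum is a hypothetical name, the built-in NormalDistribution functions are assumed to be available, and 0 < rhohat < 1 is assumed so that rhohat can be used as an intermediate integration point.

```mathematica
(* Hypothetical helper: mean power when the true ICC follows a
   normal(rhohat, sigmasq) distribution truncated to [0, 1] *)
meanPowerNum[rhohat_, sigmasq_, delta_, n_, k_, alpha_] :=
  Module[{std = NormalDistribution[0, 1], rhoDist, z, y},
    rhoDist = NormalDistribution[rhohat, Sqrt[sigmasq]];
    z = InverseCDF[std, 1 - alpha/2];
    (* numerator: power at ICC y weighted by the normal density;
       the break at rhohat helps NIntegrate locate the peak *)
    NIntegrate[
      CDF[std, delta Sqrt[n k/(4 (1 + (n - 1) y))] - z] PDF[rhoDist, y],
      {y, 0, rhohat, 1}]/
     (* denominator: normal mass on [0, 1], the truncation constant *)
     (CDF[rhoDist, 1] - CDF[rhoDist, 0])]

meanPowerNum[0.025, 0.00089, 0.4, 20, 15, 0.05]
```

The result is a probability between 0 and 1. It can be compared with the value obtained by substituting the same constants into the symbolic expression.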

Example

♦ Designing a cluster randomised trial to detect a difference of 0.4 standard deviations at a two-sided 5% significance level.

♦ Using a relevant ICC estimate $\hat\rho = 0.025$, with variance $\hat\sigma^2 = 0.00089$.

♦ What should we choose for cluster size n and number of clusters k?

Evaluating mean power numerically for n = 20, k = 15:

N[meanpower /. {rhohat->0.025, sigmasq->0.00089, delta->0.4, phisig->1.96, n->20, k->15}]

Gives 77% for mean power (compared with 81% when not allowing for imprecision in $\hat\rho$).

Rewriting this as a function for general values of n and k:

meanpowerfunc[nx_, kx_] := N[meanpower /. {rhohat->0.025, sigmasq->0.00089, delta->0.4, phisig->1.96, n->nx, k->kx}]

Plotting mean power against n ∈ [10, 30] and k ∈ [5, 25]:

Plot3D[meanpowerfunc[n,k], {n,10,30}, {k,5,25}]

[Surface plot: mean power against n and k]

A real example

Trying to get robust estimates of a random-effects distribution.
We assume that the random effects density has a N(mu,sigma^2) ‘middle’, but exponential tails beyond k sigma.

In the tail and middle respectively, the likelihood is proportional to;

fhg1 = Exp[-Abs[x - mu] k/sigma + k^2/2]
E^(k^2/2 - k Abs[-mu + x]/sigma)

fhg2 = Exp[-(x - mu)^2/2/sigma^2]
E^(-(-mu + x)^2/(2 sigma^2))

Plot[(If[Abs[x] > k, Evaluate[fhg1], Evaluate[fhg2]] /. {mu -> 0, sigma -> 1, k -> 0.5}), {x, -10, 10}, PlotRange -> {0, 1}];
[Plot: the unnormalised density against x from -10 to 10, vertical axis from 0 to 1]

Integrate these over the appropriate ranges to get the normalising factor for the density;

int1 = Simplify[Integrate[Simplify[fhg1, x < mu], {x, -Infinity, mu - k sigma}], {k > 0, sigma > 0}];
int2 = Simplify[Integrate[fhg2, {x, mu - k sigma, mu + k sigma}], {k > 0, sigma > 0}];
int3 = Simplify[Integrate[Simplify[fhg1, x > mu], {x, mu + k sigma, Infinity}], {k > 0, sigma > 0}];
normfactor = Simplify[1/(int1 + int2 + int3)]

k/(2 E^(-k^2/2) sigma + k Sqrt[2 Pi] sigma Erf[k/Sqrt[2]])
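A quick numerical sanity check on the normalising factor: multiplying the two likelihood pieces by it should give a density that integrates to 1. In the sketch below, hubDens is a hypothetical name and the parameter values (mu = 0, sigma = 1, k = 1.5) are chosen only for illustration.

```mathematica
(* hubDens is a hypothetical name: normal middle, exponential tails beyond
   k sigma, scaled by the normalising factor derived above *)
hubDens[x_, mu_, sigma_, k_] :=
  k/(2 Exp[-k^2/2] sigma + k Sqrt[2 Pi] sigma Erf[k/Sqrt[2]]) *
   If[Abs[x - mu] > k sigma,
    Exp[-Abs[x - mu] k/sigma + k^2/2],
    Exp[-(x - mu)^2/(2 sigma^2)]]

(* break the range at the two join points mu -/+ k sigma; should be close to 1 *)
NIntegrate[hubDens[x, 0, 1, 1.5], {x, -Infinity, -1.5, 1.5, Infinity}]
```

Listing the join points as intermediate limits tells NIntegrate where the integrand changes form.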

Write the random effects distribution as (1−epsilon)N(mu,sigma^2) + epsilon F.
What does the ‘contaminant’ F look like, for some sensible parameter values?

epsilon = Simplify[1 - FullSimplify[normfactor Sqrt[2 Pi sigma^2], sigma > 0]];
fdens = UnitStep[Abs[x - mu] - k sigma] (fhg1 normfactor - fhg2 (1 - epsilon)/Sqrt[2 Pi sigma^2])/epsilon;

Plot[fdens /. {mu -> 3, k -> 3/2, sigma -> 1}, {x, -4, 10}];
[Plot: density of the contaminant F against x from -4 to 10]

Now calculate the marginal likelihood of the data − integrate out the random effects

chiobsf = fhg2/Sqrt[2 Pi]/sigma /. {mu -> x, x -> chiobs, sigma -> chisigma};
General::spell1 : Possible spelling error: new symbol name "chiobs" is similar to existing symbol "chiobsf".

chimarg1 = Simplify[Integrate[chiobsf normfactor Simplify[fhg1, x < mu], {x, -Infinity, mu - k sigma}], {chisigma > 0}];
chimarg2 = Simplify[Integrate[chiobsf normfactor fhg2, {x, mu - k sigma, mu + k sigma}], {chisigma > 0}];
chimarg3 = Simplify[Integrate[chiobsf normfactor Simplify[fhg1, x > mu], {x, mu + k sigma, Infinity}], {chisigma > 0}];
chimarg = Simplify[chimarg1 + chimarg2 + chimarg3]

-(E^(k (chisigma^2 k + 2 sigma (chiobs - mu + k sigma))/(2 sigma^2)) k Sqrt[Pi/2]
    (-1 + Erf[(chisigma^2 k + sigma (chiobs - mu + k sigma))/(Sqrt[2] chisigma sigma)]))/
  (2 sigma (Sqrt[2 Pi] + E^(k^2/2) k Pi Erf[k/Sqrt[2]])) -
 (E^(k (chisigma^2 k + 2 sigma (-chiobs + mu + k sigma))/(2 sigma^2)) k Sqrt[Pi/2]
    (-1 + Erf[(chisigma^2 k + sigma (-chiobs + mu + k sigma))/(Sqrt[2] chisigma sigma)]))/
  (2 sigma (Sqrt[2 Pi] + E^(k^2/2) k Pi Erf[k/Sqrt[2]])) +
 (E^((-chiobs^2 + chisigma^2 k^2 + 2 chiobs mu - mu^2 + k^2 sigma^2)/(2 (chisigma^2 + sigma^2))) k
    (Erf[(chisigma^2 k + sigma (chiobs - mu + k sigma))/(Sqrt[2] chisigma Sqrt[chisigma^2 + sigma^2])] +
     Erf[(chisigma^2 k + sigma (-chiobs + mu + k sigma))/(Sqrt[2] chisigma Sqrt[chisigma^2 + sigma^2])]))/
  (2 Sqrt[chisigma^2 + sigma^2] (2 + E^(k^2/2) k Sqrt[2 Pi] Erf[k/Sqrt[2]]))

Try working that out by hand!
Do the integration to check that we’ve come up with a density for the observations;

NIntegrate[(chimarg /. {mu -> 3, k -> 3/2, sigma -> 1, chisigma -> 2}), {chiobs, -Infinity, Infinity}]
1.

Get the data, plug it into the likelihood, and maximise using a sensible parameterisation;

chimeans = << /homel/ken/chiprob/chimeansm.txt;
chisd = << /homel/ken/chiprob/chisdm.txt;
<< Graphics`MultipleListPlot`
MultipleListPlot[Table[{{chimeans[[i]], 138 - i}, ErrorBar[1.96 chisd[[i]], 0]}, {i, 1, 137}], AxesLabel -> {"Yobsi", "Ranking"}];
ListPlot[Transpose[{chimeans, chisd}], AxesLabel -> {"Yobsi", "sigi"}];
<< Graphics`Graphics3D`
Histogram3D[Transpose[{chimeans, chisd}], AxesLabel -> {"Yobsi", "sigi", " "}, ViewPoint -> {-1, 2, 2}];
chimargall = Sum[Log[chimarg /. {chisigma -> chisd[[i]], chiobs -> chimeans[[i]]}], {i, 1, 137}];
chimargall2 = chimargall /. {k -> Exp[lk], sigma -> Exp[lsigma]};
mle2 = NMaximize[chimargall2, {mu, lk, lsigma}]

[Plot: Yobsi with horizontal error bars against Ranking]
[Plot: sigi against Yobsi]
[3D histogram of Yobsi and sigi]

General::spell1 : Possible spelling error: new symbol name "lsigma" is similar to existing symbol "sigma".

{-153.423, {lk -> 0.313738, lsigma -> -0.50211, mu -> 5.35008}}

Check this is actually a max;

Grad[fun_, vars_List] := Map[D[fun, #] &, vars]
Hessian[fun_, vars_List] := Outer[D, Grad[fun, vars], vars]
hmle2 = (Hessian[chimargall2, {mu, lk, lsigma}]) /. mle2[[2]];
estsd = Sqrt[Table[Inverse[-hmle2][[i, i]], {i, 1, 3}]];
(Grad[chimargall2, {mu, lk, lsigma}]) /. mle2[[2]]
Eigenvalues[hmle2]

General::spell1 : Possible spelling error: new symbol name "Grad" is similar to existing symbol "Grid".

General::spell1 : Possible spelling error: new symbol name "hmle2" is similar to existing symbol "mle2".

{-0.0044682, 0.000352626, -0.0000295546}
{-281.944, -208.298, -15.3448}
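In more recent Mathematica versions Grad is a built-in, protected symbol, so a user definition with that name may be rejected there. A sketch of the same kind of check using only D, run on a hypothetical toy function (toy stands in for a log-likelihood; it is not the CHI likelihood):

```mathematica
(* toy concave function standing in for a log-likelihood; hypothetical *)
toy = -(x - 1)^2 - 2 (y + 3)^2 - x y;

grad = D[toy, {{x, y}}]       (* gradient: list of first partial derivatives *)
hess = D[toy, {{x, y}, 2}]    (* Hessian: matrix of second partial derivatives *)

Solve[grad == {0, 0}, {x, y}]     (* stationary point *)
Max[Eigenvalues[hess]] < 0        (* True when all Hessian eigenvalues are negative *)
```

A gradient near zero together with all-negative Hessian eigenvalues is what the output above shows for the fitted model, which is the evidence that the optimiser has found a local maximum.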

Point estimates and confidence intervals for mu, sigma, and epsilon

mu + 1.96 {-1, 0, 1} estsd[[1]] /. mle2[[2]]
{5.23152, 5.35008, 5.46864}

Exp[lsigma + 1.96 {-1, 0, 1} estsd[[3]] /. mle2[[2]]]
{0.473258, 0.605252, 0.774061}

epsilon /. k -> (Exp[lk + 1.96 {-1, 0, 1} estsd[[2]] /. mle2[[2]]])
{0.197222, 0.0542943, 0.00505818}

Using Mathematica to find the asymptotic variance of the estimated expected value of a lognormal distribution

Richard Nixon

14 November 2003

Likelihood, expected value and MLE’s

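$$f(y) = \frac{1}{y\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(\log y - \mu)^2}{2\sigma^2} \right)$$

$$E[Y] = \exp(\mu + \sigma^2/2)$$

$$V[\hat E[Y]] = V[\exp(\hat\mu + \hat\sigma^2/2)]$$

$$\hat\mu = \frac{1}{n} \sum_{i=1}^{n} \log y_i$$

$$\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (\log y_i - \hat\mu)^2$$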

Reparameterize

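$$\alpha = \exp(\mu) \qquad \beta = \exp(\sigma^2/2)$$

$$\hat\alpha = \exp(\hat\mu) \qquad \hat\beta = \exp(\hat\sigma^2/2)$$

$$V[\hat E[Y]] = V[\hat\alpha\hat\beta] = E[\hat\alpha]^2\,V[\hat\beta] + E[\hat\beta]^2\,V[\hat\alpha] + V[\hat\alpha]\,V[\hat\beta]$$

if $\hat\alpha$ and $\hat\beta$ are independent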
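The product-variance identity used here follows in a few lines from independence. Writing $X$ and $Y$ for two generic independent random variables (these symbols are introduced only for this derivation):

```latex
\begin{align*}
V[XY] &= E[X^2 Y^2] - \left(E[XY]\right)^2 \\
      &= E[X^2]\,E[Y^2] - E[X]^2\,E[Y]^2 \\
      &= \left(V[X] + E[X]^2\right)\left(V[Y] + E[Y]^2\right) - E[X]^2\,E[Y]^2 \\
      &= E[X]^2\,V[Y] + E[Y]^2\,V[X] + V[X]\,V[Y]
\end{align*}
```

The second line is where independence is used. Without it the expectations of the products do not factorise.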

Log likelihood

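$$L = -\sum_{i=1}^{n} \log(y_i) - \frac{n}{2}\log(2\pi\sigma^2) - \sum_{i=1}^{n} \frac{(\log y_i - \mu)^2}{2\sigma^2}$$

$$\phantom{L} = -n\hat\mu - \frac{n}{2}\log(4\pi\log\beta) - \frac{n\left(\hat\sigma^2 + (\hat\mu - \log\alpha)^2\right)}{4\log\beta}$$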

Wilks theorem

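$$\begin{pmatrix} \hat\alpha \\ \hat\beta \end{pmatrix} \sim N\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix},\; -\begin{pmatrix} \dfrac{\partial^2 L}{\partial\alpha^2} & \dfrac{\partial^2 L}{\partial\alpha\,\partial\beta} \\[2ex] \dfrac{\partial^2 L}{\partial\alpha\,\partial\beta} & \dfrac{\partial^2 L}{\partial\beta^2} \end{pmatrix}^{-1} \Bigg|_{\alpha = e^{\hat\mu},\; \beta = e^{\hat\sigma^2/2}} \right)$$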

Likelihood function

l = 1/(y*Sqrt[2*Pi*ss]) Exp[-(Log[y] - mu)^2/(2 ss)]
E^(-(-mu + Log[y])^2/(2 ss))/(Sqrt[2 Pi] Sqrt[ss] y)

Integrate[l, {y, 0, Infinity}, Assumptions -> {ss > 0}]
1

Expected value

m1 = Integrate[y l, {y, 0, Infinity}, Assumptions -> {ss > 0}]
E^(mu + ss/2)
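As a cross-check, recent Mathematica versions ship a built-in LogNormalDistribution (an assumption about the reader's version; the session above integrates the density by hand). Note that its second parameter is the standard deviation sigma, so it corresponds to ss = sigma^2 in the notation above.

```mathematica
(* LogNormalDistribution takes mu and the standard deviation sigma, not the variance *)
Mean[LogNormalDistribution[mu, sigma]]     (* E^(mu + sigma^2/2) *)

(* density at y, for comparison with l when ss = sigma^2 *)
PDF[LogNormalDistribution[mu, sigma], y]
```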

Log likelihood function

L = -n muhat - n/2 Log[4 Pi Log[b]] - n (sshat + (muhat - Log[a])^2)/(4 Log[b])
-muhat n - n (sshat + (muhat - Log[a])^2)/(4 Log[b]) - 1/2 n Log[4 Pi Log[b]]

Hessian matrix

H = FullSimplify[{{D[L, {a, 2}], D[L, a, b]}, {D[L, a, b], D[L, {b, 2}]}}]
{{-(n (1 + muhat - Log[a]))/(2 a^2 Log[b]), n (-muhat + Log[a])/(2 a b Log[b]^2)},
 {n (-muhat + Log[a])/(2 a b Log[b]^2),
  -(n (2 (muhat^2 + sshat) + (-2 + muhat^2 + sshat - 2 Log[b]) Log[b] -
       2 muhat Log[a] (2 + Log[b]) + Log[a]^2 (2 + Log[b])))/(4 b^2 Log[b]^3)}}

Observed information

info = FullSimplify[Inverse[-H] /. Log[a] -> muhat /. a -> Exp[muhat] /. Log[b] -> sshat/2 /. b -> Exp[sshat/2]]
{{E^(2 muhat) sshat/n, 0}, {0, E^sshat sshat^2/(2 n)}}

Asymptotic variance of the estimated expected value

FullSimplify[Exp[2 muhat] info[[2]][[2]] + Exp[sshat] info[[1]][[1]] + info[[1]][[1]] info[[2]][[2]]]
E^(2 muhat + sshat) sshat (sshat^2 + n (2 + sshat))/(2 n^2)
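The closed form can be wrapped as a function for plugging in estimates. In this sketch avar is a hypothetical name and the inputs muhat = 0, sshat = 1, n = 100 are made up for illustration:

```mathematica
(* avar is a hypothetical name for the closed form obtained above *)
avar[muhat_, sshat_, n_] :=
  E^(2 muhat + sshat) sshat (sshat^2 + n (2 + sshat))/(2 n^2)

(* numerical value for illustrative inputs *)
N[avar[0, 1, 100]]

(* large-sample behaviour: n times the variance tends to a constant *)
Limit[n avar[muhat, sshat, n], n -> Infinity]
```

The limit is E^(2 muhat + sshat) sshat (2 + sshat)/2, so for large n the V[α]V[β] product term contributes only at order 1/n^2.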
