Page 1
Introduction to Mathematica, for Statisticians
Contents:
Ken’s introductory slides (6 pages)
Accompanying notebook file (3 pages)
Fotios’s notebook file, with more detailed examples (3 pages)
Becky’s notes on her example (3 pages) and accompanying graph (1 page)
Ken’s example (5 pages)
…and some bonus material!
Richard’s slides (4 pages) and notebook file (2 pages) for his unseen example
Page 2
Mathematica
information sharing session
11th November 2003
Ken + Becky, Fotios, Richard
Page 3
‘How to’ Mathematica
• What does Mathematica do?
– symbolic maths manipulation (algebra)
– highly accurate numerical work
– fancy graphics
• What doesn’t it do?
– statistical ‘tools’; residuals, R², anova
– nice data manipulation
– nice data input/output
Page 4
Syntax
• Use <shift+return> to evaluate expressions
• Generally as close as possible to ‘real’ maths
• Objects/functions work similarly to R/S
• Functions have sensible names
• Everything stored in ‘notebook’ files (.nb); input, output, annotation, pictures
Page 5
Syntactical details
• Functions start with capital letters; use [ ] for function arguments
    Abs[-22]
    N[Sqrt[2]]
• Use = for assignment
    y = 10
• NewFunction[x_] :=
    MyFunc[x_] := x^2
• ‘Evaluate at’; expression /. x->x0
    MyFunc[10]
    MyFunc[y] /. y->10
• Use { } for lists/vectors/matrices
    Plot[y, {y,0,10}]
    ylist = {0,2,25}
• == represents equality (like R/S)
    Solve[x^2==4, x]
• Use ( ) intuitively, stops confusion
    z^2 /. z->10+5
Page 6
A real example
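• CHI problem: hospitals’ death rates $Y_i$ follow
$$Y_i \sim N(\mu_i, \sigma_i^2)$$
$$\mu_i \sim (1-\varepsilon)\,N(\mu_0, \sigma_0^2) + \varepsilon F$$
• Three unknown parameters, plus have to make a good guess for F
[Diagram: nodes labelled Y_i, µ_i, σ_i and Z_i, with µ_0, σ_0, ε and F feeding into µ_i (µ_i given Z=0, µ_i given Z=1)]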
Page 7
Other useful things
• Good tutorial + help system (master index)
• Add-ons for extra functionality - MathStatica
• Batch mode for e.g. simulations
• Abort: <Alt + .>
• Numerical iteration methods not perfect –
just like other packages
• Not foolproof (integration)
• Limited number of licenses!
Page 8
Examples of Mathematica syntax
Use <Shift+Return> to evaluate expressions;
1 + 1
2
2^50
2^-50
1125899906842624
1/1125899906842624
Functions start with capital letters, and generally have sensible names;
Abs[-22]
22
Many functions have default settings, usually ignored, as in R
N[Sqrt[2]]
N[Sqrt[2], 40]
1.41421
1.414213562373095048801688724209698078570
Assign values using =
A semi-colon afterwards stops output
y = 10
10
y = 10;
y^2
N[Sqrt[y]]
100
3.16228
Defining new functions, and evaluating expressions at particular values
MyFunc[x_] := x^2
MyFunc[10]
100
MyFunc[y]
100
Page 9
z^2 /. z -> 10
MyFunc[z] /. z -> 10
z
100
100
z
Curly brackets are used for lists; this includes vectors, matrices, and also ranges
Plot[MyFunc[z], {z, -5, 10}]
[Plot of MyFunc[z] against z from -5 to 10]
- Graphics -
avec = {a1, a2};
amatrix = {{1, 2}, {2, 1}};
MatrixForm[amatrix]
amatrix.avec
( 1  2 )
( 2  1 )
{a1 + 2 a2, 2 a1 + a2}
Use == for equality;
Solve[x^2 == 4, x]
{{x -> -2}, {x -> 2}}
bmatrix = Table[If[i == j, 2, 1], {i, 1, 2}, {j, 1, 2}]
MatrixForm[bmatrix]
General::spell1 : Possible spelling error: new symbol name "bmatrix" is similar to existing symbol "amatrix".
{{2, 1}, {1, 2}}
( 2  1 )
( 1  2 )
Regular brackets ( ) act in the normal way
Expand[(x + 2)^10, x]
1024 + 5120 x + 11520 x^2 + 15360 x^3 + 13440 x^4 + 8064 x^5 + 3360 x^6 + 960 x^7 + 180 x^8 + 20 x^9 + x^10
Page 10
z^2 /. z -> 10 + 5
(z^2 /. z -> 10) + 5
225
105
Other commands of general interest
Simplify[1024 + 5120 x + 11520 x^2 + 15360 x^3 + 13440 x^4 + 8064 x^5 + 3360 x^6 + 960 x^7 + 180 x^8 + 20 x^9 + x^10]
(2 + x)^10
TeXForm[MatrixForm[bmatrix]]
\matrix{ 2 & 1 \cr 1 & 2 \cr }
Simplify[Log[8]/Log[2]]
FullSimplify[Log[8]/Log[2]]
Log[8]/Log[2]
3
Plot3D[Cos[x] Sin[y], {x, 0, 2 Pi}, {y, 0, 2 Pi}];
[3D surface plot of Cos[x] Sin[y] for x and y from 0 to 2 Pi]
FindMinimum[1 + x^2, {x, -3}]
{1., {x -> 0.}}
<<NumericalMath`NMinimize`
NMinimize[1+x^2,x]
NMinimize[{1+x^2,x>2},x]
{1., {x -> 0.}}
NMinimize::strong : Strong inequality has been changed to a weak inequality.
{5., {x -> 2.}}
Page 11
Vector + Matrices
w = {1, 2, 3}
{1, 2, 3}
q = {{1, 2}, {3, 4}}
{{1, 2}, {3, 4}}
Import data from .txt file
chdmm = Import["CHDMM.txt", "Table"]
{{"AGEGRP", "GRLEV", "GRLUMP", "SEX", "S", "T", "I1", "I2", "I3", "I4", "I5"},
{4, 1, 1, 1, 3520, 4254, 1, 0, 0, 0, 0}, {4, 4, 2, 1, 0, 701, 1, 0, 0, 0, 0},
{4, 2, 1, 1, 2193, 2763, 1, 0, 0, 0, 0}, {4, 6, 3, 1, 0, 988, 1, 0, 0, 0, 0},
{4, 4, 2, 1, 2146, 3286, 1, 0, 0, 0, 0}, {3, 6, 3, 2, 3346, 3413, 1, 0, 0, 0, 0},
{4, 5, 2, 1, 2486, 2584, 1, 0, 0, 0, 0}, {3, 6, 3, 2, 3313, 3652, 1, 0, 0, 0, 0},
{4, 6, 3, 1, 3472, 3484, 1, 0, 0, 0, 0}, {3, 1, 1, 1, 2530, 2782, 1, 0, 0, 0, 0},
Access data
w[[2]]
2
q[[2, 2]]
4
Part[chdmm, 1, 1]
AGEGRP
Extract[chdmm, 1]
{AGEGRP, GRLEV, GRLUMP, SEX, S, T, I1, I2, I3, I4, I5}
Extract[chdmm, 100]
{4, 1, 1, 1, 825, 5026, 0, 1, 0, 0, 0}
chdmm[[1, 1]]
AGEGRP
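A possible next step, sketched under assumptions: the header row usually has to be split off before doing any arithmetic on the columns. The small table demo below is a hand-typed stand-in for chdmm (the first two data rows, first six columns only), and the names demo, header and rows are made up for illustration. It assumes a version of Mathematica where All works inside [[ ]] and Mean is built in.

```mathematica
(* Hand-typed stand-in for the imported table: header plus two data rows *)
demo = {{"AGEGRP", "GRLEV", "GRLUMP", "SEX", "S", "T"},
        {4, 1, 1, 1, 3520, 4254},
        {4, 4, 2, 1, 0, 701}};

header = First[demo];   (* the row of column names *)
rows = Rest[demo];      (* everything except the header *)

rows[[All, 5]]          (* whole column "S": {3520, 0} *)
Mean[rows[[All, 6]]]    (* mean of column "T": 4955/2 *)
```

The same pattern should carry over to the full table by replacing demo with chdmm.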
Differentiation
Page 12
$\partial_x (x^2 + 5x)$
5 + 2 x
(#^2 + 5 # &)'[x]
5 + 2 x
f[x_] := x^2 + 5 x
f'[x]
5 + 2 x
Derivative[1][f][x]
5 + 2 x
f''[x]
2
Derivative[2][f][x]
2
g[x_, y_, z_] := x^3 + y^2 + z^2
g'[x]
g′[x]
Derivative[1, 0, 0][g][x, y, z]
3 x^2
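Note that g'[x] comes back unevaluated because g takes three arguments. As a hedged alternative to Derivative[1, 0, 0], partial derivatives can also be written with D, which some people find easier to read:

```mathematica
g[x_, y_, z_] := x^3 + y^2 + z^2

D[g[x, y, z], x]        (* partial derivative in x: 3 x^2 *)
D[g[x, y, z], {y, 2}]   (* second partial derivative in y: 2 *)
D[g[x, y, z], y, z]     (* mixed partial in y and z: 0 *)
```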
Integration
$\int_0^{100} (5t^4 + 2t)\,dt$
10000010000
Integrate[5 t^4 + 2 t, {t, 0, 100}]
10000010000
$\int (5t^4 + 2t)\,dt$
t^2 + t^5
Page 13
Integrate[5 t^4 + 2 t, t]
t^2 + t^5
Double Integral
$\int_0^{10}\int_0^{10} (x^3 + y^3)\,dx\,dy$
50000
Integrate[x^3 + y^3, x, y]
x^4 y/4 + x y^4/4
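The typed command above is the indefinite version. A short sketch of how the definite double integral could be typed instead of entered from the palette; the limits are the ones used in the typeset example:

```mathematica
(* Definite double integral over 0 <= x <= 10, 0 <= y <= 10 *)
Integrate[x^3 + y^3, {x, 0, 10}, {y, 0, 10}]              (* 50000 *)

(* The same thing as two nested single integrals *)
Integrate[Integrate[x^3 + y^3, {y, 0, 10}], {x, 0, 10}]   (* 50000 *)
```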
Plots
Plot[x^2, {x, 0, 100}]
[Plot of x^2 against x from 0 to 100]
- Graphics -
Plot3D[x^2 + y^2, {x, 0, 10}, {y, 0, 10}]
[3D surface plot of x^2 + y^2 for x and y from 0 to 10]
- SurfaceGraphics -
Page 14
Background
♦ Cluster randomised trials randomise groups of individuals
together.
♦ To allow for correlation within clusters, sample size must
be inflated to retain adequate power.
♦ Power of a cluster randomised trial depends on the
intracluster correlation coefficient (ICC).
Power of a planned cluster randomised trial
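Traditional to calculate power using an available ICC estimate $\hat\rho$:
$$\mathrm{Power}(\hat\rho) = \Phi\left\{ \delta\sqrt{\frac{nk}{4\,[1 + (n-1)\hat\rho]}} - \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \right\}$$
where n = planned cluster size
k = planned number of clusters
δ = difference to be detected, in standard deviations
α = two-sided significance level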
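A minimal sketch of this formula as a function, assuming a version of Mathematica in which NormalDistribution, CDF and Quantile are built in. The name clusterPower is made up for illustration; the inputs are the values used in the example later in these notes.

```mathematica
(* Traditional power for a fixed ICC value rho (illustrative name) *)
clusterPower[rho_, delta_, alpha_, n_, k_] :=
 CDF[NormalDistribution[0, 1],
  delta Sqrt[n k/(4 (1 + (n - 1) rho))] -
   Quantile[NormalDistribution[0, 1], 1 - alpha/2]]

(* n = 20 per cluster, k = 15 clusters, ICC 0.025, delta 0.4, alpha 0.05 *)
clusterPower[0.025, 0.4, 0.05, 20, 15]   (* about 0.81 *)
```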
We want to calculate power while allowing for the imprecision in $\hat\rho$:
$$\mathrm{Power}(\rho) = \Phi\left\{ \delta\sqrt{\frac{nk}{4\,[1 + (n-1)\rho]}} - \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \right\}$$
♦ Using information about the data from which $\hat\rho$ was obtained, we construct a distribution for the true ICC $\rho$
♦ This gives a distribution for $\mathrm{Power}(\rho)$
Page 15
Examining the distribution for power
Useful to look at mean power, which is proportional to:
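$$\int_0^1 \Phi\left\{ \delta\sqrt{\frac{nk}{4\,[1 + (n-1)y]}} - \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \right\} \frac{1}{\sqrt{2\pi\hat\sigma^2}} \exp\left\{ -\frac{(y - \hat\rho)^2}{2\hat\sigma^2} \right\} dy$$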
Defining the double integral, in stages:
xintegrand = Exp[-x^2/2]/Sqrt[2Pi]
yintegrand = Exp[-(y - rhohat)^2/(2 sigmasq)]/Sqrt[2Pi sigmasq]
phi[x_] := Sqrt[1/(2 Pi)] Integrate[Exp[-(y^2)/2], {y, -Infinity, x}]
constant = 1/(phi[(1-rhohat)/Sqrt[sigmasq]] - phi[-rhohat/Sqrt[sigmasq]])
Combining to obtain expression for mean power:
meanpower = constant Integrate[xintegrand yintegrand, {y,0,1}, {x,-Infinity,((delta Sqrt[(n k)/4])/Sqrt[1 + (n - 1)y]) - phisig}]
The integral can’t be evaluated analytically.
But Mathematica will give numerical values for mean power if we provide values for the constants $\hat\rho$, $\hat\sigma^2$, δ, n and k.
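As a cross-check, the same quantity can be sketched as a single numerical integral, since the inner integral over x is just a standard normal CDF. This is not the code from these notes: it assumes a version where NormalDistribution, CDF and PDF are built in, and the names stdNormalCDF and meanPowerNumeric and the integration variable rho are made up for illustration. The result can be compared with the 77% quoted in the example that follows.

```mathematica
(* Standard normal CDF, standing in for the hand-built phi *)
stdNormalCDF[x_] := CDF[NormalDistribution[0, 1], x]

(* Power averaged over a normal distribution for the ICC, truncated to [0, 1] *)
meanPowerNumeric[rhohat_, sigmasq_, delta_, zcrit_, n_, k_] :=
 Module[{sd = Sqrt[sigmasq], norm},
  (* probability mass of the untruncated normal lying in [0, 1] *)
  norm = stdNormalCDF[(1 - rhohat)/sd] - stdNormalCDF[-rhohat/sd];
  NIntegrate[
    stdNormalCDF[delta Sqrt[n k/4]/Sqrt[1 + (n - 1) rho] - zcrit]*
     PDF[NormalDistribution[rhohat, sd], rho], {rho, 0, 1}]/norm]

meanPowerNumeric[0.025, 0.00089, 0.4, 1.96, 20, 15]
```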
Page 16
Example
♦ Designing a cluster randomised trial to detect a difference
of 0.4 standard deviations at a two-sided 5% significance level.
♦ Using a relevant ICC estimate $\hat\rho$ = 0.025, with variance $\hat\sigma^2$ = 0.00089.
♦ What should we choose for cluster size n and number of clusters k?
Evaluating mean power numerically for n = 20, k = 15:
N[meanpower /. {rhohat->0.025,sigmasq->0.00089,
delta->0.4,phisig->1.96,n->20,k->15}]
Gives 77% for mean power (compared with 81% when not allowing for imprecision in $\hat\rho$).
Rewriting this as a function for general values of n and k:
meanpowerfunc[nx_, kx_] := N[meanpower /. {rhohat->0.025,sigmasq->0.00089,delta->0.4,phisig->1.96,n->nx,k->kx}]
Plotting mean power against n ∈ [10, 30] and k ∈ [5, 25]:
Plot3D[meanpowerfunc[n,k], {n,10,30}, {k,5,25}]
Page 17
[3D surface plot: mean power (vertical axis, ticks at 0.4, 0.6, 0.8) against n from 10 to 30 and k from 5 to 25]
Page 18
A real example
Trying to get robust estimates of a random-effects distribution;
We assume that the random effects density has a N(mu,sigma^2) ‘middle’, but exponential tails beyond k sigma.
In the tail and middle respectively, the likelihood is proportional to;
fhg1 = Exp[-Abs[x - mu] k/sigma + k^2/2]
fhg2 = Exp[-(x - mu)^2/2/sigma^2]
Plot[(If[Abs[x] > k, Evaluate[fhg1], Evaluate[fhg2]] /. {mu -> 0, sigma -> 1, k -> 0.5}), {x, -10, 10}, PlotRange -> {0, 1}];
E^(k^2/2 - k Abs[-mu + x]/sigma)
E^(-(-mu + x)^2/(2 sigma^2))
[Plot of the unnormalised density against x from -10 to 10, vertical axis from 0 to 1]
Integrate these over the appropriate ranges to get the normalising factor for the density;
int1 = Simplify[Integrate[Simplify[fhg1, x < mu], {x, -Infinity, mu - k sigma}], {k > 0, sigma > 0}];
int2 = Simplify[Integrate[fhg2, {x, mu - k sigma, mu + k sigma}], {k > 0, sigma > 0}];
int3 = Simplify[Integrate[Simplify[fhg1, x > mu], {x, mu + k sigma, Infinity}], {k > 0, sigma > 0}];
normfactor = Simplify[1/(int1 + int2 + int3)]
k/(2 E^(-k^2/2) sigma + k Sqrt[2 Pi] sigma Erf[k/Sqrt[2]])
Write the random effects distribution as (1−epsilon)N(mu,sigma^2) + epsilon F
What does the ‘contaminant’ F look like, for some sensible parameter values?
epsilon = Simplify[1 - FullSimplify[normfactor Sqrt[2 Pi sigma^2], sigma > 0]];
fdens = UnitStep[Abs[x - mu] - k sigma] (fhg1 normfactor - fhg2 (1 - epsilon)/Sqrt[2 Pi sigma^2])/epsilon;
Page 19
Plot[fdens /. {mu -> 3, k -> 3/2, sigma -> 1}, {x, -4, 10}];
[Plot of the contaminant density F against x from -4 to 10, vertical axis from 0 to 0.25]
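A quick sanity check one might add here, sketched under assumptions: if F is a proper density it should integrate to 1. The code retypes the closed form of normfactor from the output above (with sigma > 0 taken for granted) rather than recomputing the integrals, and the name vals is made up for illustration. The range is split at mu - k sigma and mu + k sigma, where fdens switches on.

```mathematica
(* Tail and middle kernels, as defined earlier in the notebook *)
fhg1 = Exp[-Abs[x - mu] k/sigma + k^2/2];
fhg2 = Exp[-(x - mu)^2/2/sigma^2];

(* Closed form of the normalising factor, retyped from the notebook output *)
normfactor = k/(2 Exp[-k^2/2] sigma + k Sqrt[2 Pi] sigma Erf[k/Sqrt[2]]);
epsilon = 1 - normfactor Sqrt[2 Pi] sigma;   (* assumes sigma > 0 *)
fdens = UnitStep[Abs[x - mu] - k sigma] (fhg1 normfactor -
     fhg2 (1 - epsilon)/Sqrt[2 Pi sigma^2])/epsilon;

vals = {mu -> 3, k -> 3/2, sigma -> 1};

(* Break points at mu -/+ k sigma = 3/2 and 9/2; should come out close to 1 *)
NIntegrate[fdens /. vals, {x, -Infinity, 3/2, 9/2, Infinity}]
```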
Now calculate the marginal likelihood of the data − integrate out the random effects
chiobsf = fhg2/Sqrt[2 Pi]/sigma /. {mu -> x, x -> chiobs, sigma -> chisigma};
General::spell1 : Possible spelling error: new symbol name "chiobs" is similar to existing symbol "chiobsf".
chimarg1 = Simplify[Integrate[chiobsf normfactor Simplify[fhg1, x < mu], {x, -Infinity, mu - k sigma}], {chisigma > 0}];
chimarg2 = Simplify[Integrate[chiobsf normfactor fhg2, {x, mu - k sigma, mu + k sigma}], {chisigma > 0}];
chimarg3 = Simplify[Integrate[chiobsf normfactor Simplify[fhg1, x > mu], {x, mu + k sigma, Infinity}], {chisigma > 0}];
chimarg = Simplify[chimarg1 + chimarg2 + chimarg3]
-(E^(k (chisigma^2 k + 2 sigma (chiobs - mu + k sigma))/(2 sigma^2)) k Sqrt[Pi/2] (-1 + Erf[(chisigma^2 k + sigma (chiobs - mu + k sigma))/(Sqrt[2] chisigma sigma)]))/(2 sigma (Sqrt[2 Pi] + E^(k^2/2) k Pi Erf[k/Sqrt[2]]))
 - (E^(k (chisigma^2 k + 2 sigma (-chiobs + mu + k sigma))/(2 sigma^2)) k Sqrt[Pi/2] (-1 + Erf[(chisigma^2 k + sigma (-chiobs + mu + k sigma))/(Sqrt[2] chisigma sigma)]))/(2 sigma (Sqrt[2 Pi] + E^(k^2/2) k Pi Erf[k/Sqrt[2]]))
 + (E^((-chiobs^2 + chisigma^2 k^2 + 2 chiobs mu - mu^2 + k^2 sigma^2)/(2 (chisigma^2 + sigma^2))) k (Erf[(chisigma^2 k + sigma (chiobs - mu + k sigma))/(Sqrt[2] chisigma Sqrt[chisigma^2 + sigma^2])] + Erf[(chisigma^2 k + sigma (-chiobs + mu + k sigma))/(Sqrt[2] chisigma Sqrt[chisigma^2 + sigma^2])]))/(2 Sqrt[chisigma^2 + sigma^2] (2 + E^(k^2/2) k Sqrt[2 Pi] Erf[k/Sqrt[2]]))
Try working that out by hand!
Do the integration to check that we’ve come up with a density for the observations;
NIntegrate[(chimarg /. {mu -> 3, k -> 3/2, sigma -> 1, chisigma -> 2}), {chiobs, -Infinity, Infinity}]
1.
Get the data, plug it into the likelihood, and maximise using a sensible parameterisation;
Page 20
chimeans = << /homel/ken/chiprob/chimeansm.txt;
chisd = << /homel/ken/chiprob/chisdm.txt;
<< Graphics`MultipleListPlot`
MultipleListPlot[Table[{{chimeans[[i]], 138 - i}, [email protected] chisd[[i]], 0]}, {i, 1, 137}], AxesLabel -> {"Yobsi", "Ranking"}];
ListPlot[Transpose[{chimeans, chisd}], AxesLabel -> {"Yobsi", "sigi"}];
<< Graphics`Graphics3D`
Histogram3D[Transpose[{chimeans, chisd}], AxesLabel -> {"Yobsi", "sigi", " "}, ViewPoint -> {-1, 2, 2}];
chimargall = Sum[Log[chimarg /. {chisigma -> chisd[[i]], chiobs -> chimeans[[i]]}], {i, 1, 137}];
chimargall2 = chimargall /. {k -> Exp[lk], sigma -> Exp[lsigma]};
mle2 = NMaximize[chimargall2, {mu, lk, lsigma}]
[Plot: Ranking (vertical axis, up to 140) against Yobsi (horizontal axis, about 3 to 7), one point per observation]
Page 21
[Plot: sigi (vertical axis, about 0.15 to 0.25) against Yobsi (horizontal axis, about 4 to 7)]
[3D histogram of Yobsi and sigi]
General::spell1 : Possible spelling error: new symbol name "lsigma" is similar to existing symbol "sigma".
{-153.423, {lk -> 0.313738, lsigma -> -0.50211, mu -> 5.35008}}
Check this is actually a max;
Page 22
Grad[fun_, vars_List] := Map[D[fun, #] &, vars]
Hessian[fun_, vars_List] := Outer[D, Grad[fun, vars], vars]
hmle2 = (Hessian[chimargall2, {mu, lk, lsigma}]) /. mle2[[2]];
estsd = Sqrt[Table[Inverse[-hmle2][[i, i]], {i, 1, 3}]];
(Grad[chimargall2, {mu, lk, lsigma}]) /. mle2[[2]]
Eigenvalues[hmle2]
General::spell1 : Possible spelling error: new symbol name "Grad" is similar to existing symbol "Grid".
General::spell1 : Possible spelling error: new symbol name "hmle2" is similar to existing symbol "mle2".
{-0.0044682, 0.000352626, -0.0000295546}
{-281.944, -208.298, -15.3448}
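A caution and a sketch: in later versions of Mathematica, Grad is (as far as I know) a protected built-in, so the definition above may be rejected there. One alternative that avoids defining anything is to let D do the work, using its vector-derivative forms. The toy objective and the names toy, sol, grad and hess below are made up for illustration; the real chimargall2 needs the data files.

```mathematica
(* Toy objective with a known maximum at a = 1, b = -3 (illustration only) *)
toy = -(a - 1)^2 - 2 (b + 3)^2;
sol = {a -> 1, b -> -3};

grad = D[toy, {{a, b}}] /. sol      (* gradient at the candidate maximum: {0, 0} *)
hess = D[toy, {{a, b}, 2}] /. sol   (* matrix of second derivatives: {{-2, 0}, {0, -4}} *)

(* A maximum needs a zero gradient and all eigenvalues negative *)
Eigenvalues[hess]                   (* {-4, -2} *)

(* Approximate standard errors from the inverse of minus the Hessian *)
Sqrt[Diagonal[Inverse[-hess]]]      (* {1/Sqrt[2], 1/2} *)
```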
Point estimates and confidence intervals for mu,sigma,and epsilon
mu + 1.96 {-1, 0, 1} estsd[[1]] /. mle2[[2]]
Exp[lsigma + 1.96 {-1, 0, 1} estsd[[3]] /. mle2[[2]]]
epsilon /. k -> (Exp[lk + 1.96 {-1, 0, 1} estsd[[2]] /. mle2[[2]]])
{5.23152, 5.35008, 5.46864}
{0.473258, 0.605252, 0.774061}
{0.197222, 0.0542943, 0.00505818}
Page 23
Using Mathematica to find the asymptotic variance of the estimated expected value of a lognormal distribution
Richard Nixon
14 November 2003
Page 24
Likelihood, expected value and MLE’s
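$$f(y) = \frac{1}{y\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(\log y - \mu)^2}{2\sigma^2} \right)$$
$$E[Y] = \exp(\mu + \sigma^2/2)$$
$$V[\hat{E}[Y]] = V[\exp(\hat\mu + \hat\sigma^2/2)]$$
$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} \log y_i$$
$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (\log y_i - \hat\mu)^2$$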
Page 25
Reparameterize
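$$\alpha = \exp(\mu) \qquad \beta = \exp(\sigma^2/2)$$
$$\hat\alpha = \exp(\hat\mu) \qquad \hat\beta = \exp(\hat\sigma^2/2)$$
$$V[\hat{E}[Y]] = V[\hat\alpha\hat\beta] = E[\hat\alpha]^2 V[\hat\beta] + E[\hat\beta]^2 V[\hat\alpha] + V[\hat\alpha]V[\hat\beta]$$
if $\hat\alpha$ and $\hat\beta$ are independent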
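For completeness, a short sketch of where the product formula comes from, written for two generic independent random variables X and Y (standing in for the two estimators):

```latex
\begin{align*}
V[XY] &= E[X^2 Y^2] - E[XY]^2 \\
      &= E[X^2]\,E[Y^2] - E[X]^2 E[Y]^2 && \text{(independence)} \\
      &= \left(V[X] + E[X]^2\right)\left(V[Y] + E[Y]^2\right) - E[X]^2 E[Y]^2 \\
      &= E[X]^2 V[Y] + E[Y]^2 V[X] + V[X]\,V[Y]
\end{align*}
```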
Page 26
Log likelihood
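$$L = -\sum_{i=1}^{n} \log(y_i) - \frac{n}{2}\log(2\pi\sigma^2) - \sum_{i=1}^{n} \frac{(\log y_i - \mu)^2}{2\sigma^2}$$
$$= -n\hat\mu - \frac{n}{2}\log(4\pi\log\beta) - \frac{n\left( \hat\sigma^2 + (\hat\mu - \log\alpha)^2 \right)}{4\log\beta}$$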
Wilks theorem
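$$\begin{pmatrix} \hat\alpha \\ \hat\beta \end{pmatrix} \sim N\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix},\; -\left. \begin{pmatrix} \dfrac{\partial^2 L}{\partial\alpha^2} & \dfrac{\partial^2 L}{\partial\alpha\,\partial\beta} \\ \dfrac{\partial^2 L}{\partial\alpha\,\partial\beta} & \dfrac{\partial^2 L}{\partial\beta^2} \end{pmatrix}^{-1} \right|_{\alpha = e^{\hat\mu},\; \beta = e^{\hat\sigma^2/2}} \right)$$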
Page 27
Likelihood function
l = 1/(y*Sqrt[2*Pi*ss]) Exp[-(Log[y] - mu)^2/(2 ss)]
E^(-(-mu + Log[y])^2/(2 ss))/(Sqrt[2 Pi] Sqrt[ss] y)
Integrate[l, {y, 0, Infinity}, Assumptions -> {ss > 0}]
1
Expected value
m1 = Integrate[y l, {y, 0, Infinity}, Assumptions -> {ss > 0}]
E^(mu + ss/2)
Log likelihood function
L = -n muhat - n/2 Log[4 Pi Log[b]] - n (sshat + (muhat - Log[a])^2)/(4 Log[b])
-muhat n - n (sshat + (muhat - Log[a])^2)/(4 Log[b]) - 1/2 n Log[4 Pi Log[b]]
Hessian matrix
H = FullSimplify[{{D[L, {a, 2}], D[L, a, b]}, {D[L, a, b], D[L, {b, 2}]}}]
{{-n (1 + muhat - Log[a])/(2 a^2 Log[b]), n (-muhat + Log[a])/(2 a b Log[b]^2)},
 {n (-muhat + Log[a])/(2 a b Log[b]^2), -(1/(4 b^2 Log[b]^3)) (n (2 (muhat^2 + sshat) + (-2 + muhat^2 + sshat - 2 Log[b]) Log[b] - 2 muhat Log[a] (2 + Log[b]) + Log[a]^2 (2 + Log[b])))}}
Observed information
info = FullSimplify[Inverse[-H] /. Log[a] -> muhat /. a -> Exp[muhat] /. Log[b] -> sshat/2 /. b -> Exp[sshat/2]]
{{E^(2 muhat) sshat/n, 0}, {0, E^sshat sshat^2/(2 n)}}
Page 28
à Asymptotic variance of the estimated expected value
FullSimplify[Exp[2 muhat] info[[2]][[2]] + Exp[sshat] info[[1]][[1]] + info[[1]][[1]] info[[2]][[2]]]
E^(2 muhat + sshat) sshat (sshat^2 + n (2 + sshat))/(2 n^2)
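A small self-contained check of this last step, as a sketch: it retypes the info matrix from the output above rather than recomputing it, confirms that the three-term sum matches the closed form, and plugs in some made-up values (muhat = 0, sshat = 1, n = 100) purely for illustration. The names threeTerms and closedForm are not from the original notebook.

```mathematica
(* Inverse observed information, retyped from the notebook output *)
info = {{Exp[2 muhat] sshat/n, 0}, {0, Exp[sshat] sshat^2/(2 n)}};

(* E[a]^2 V[b] + E[b]^2 V[a] + V[a] V[b], with a = Exp[muhat], b = Exp[sshat/2] *)
threeTerms = Exp[2 muhat] info[[2, 2]] + Exp[sshat] info[[1, 1]] +
   info[[1, 1]] info[[2, 2]];
closedForm = Exp[2 muhat + sshat] sshat (sshat^2 + n (2 + sshat))/(2 n^2);

Simplify[threeTerms - closedForm]                     (* 0 *)

(* Illustrative values only *)
N[closedForm /. {muhat -> 0, sshat -> 1, n -> 100}]   (* about 0.0409 *)
```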