Multiple Testing Methods For ChIP–Chip High Density Oligonucleotide Array Data


Sunduz Keles

Department of Statistics and of Biostatistics & Medical Informatics

University of Wisconsin, Madison

BIRS Workshop, Statistical Science for Genome Biology

August 14-19, 2004

Sunduz Keles, 08-18-04

Acknowledgements

Joint work with

Mark J. van der Laan, Division of Biostatistics, UC Berkeley.

Sandrine Dudoit, Division of Biostatistics, UC Berkeley.

Simon E. Cawley, Affymetrix.

Thanks to

Tom Gingeras and Stefan Bekiranov, Affymetrix.

Siew Leng Teng, Division of Biostatistics, UC Berkeley.


Outline

• Overview of ChIP-Chip experiments.

• Spatial data structure of ChIP-Chip experiments: blips.

• ChIP-Chip data for transcription factor p53.

• Multiple hypotheses testing procedures to identify blips, i.e., bound probes.

• A model selection framework for determining the blip size.

• Application to ChIP-Chip data of transcription factor p53.

• Conclusions and ongoing work.


ChIP-Chip high density oligonucleotide array data: a new type of genomic data

• Chromatin immunoprecipitation (ChIP) is a procedure for investigating interactions between proteins and DNA. Coupled with whole-genome DNA microarrays (Chip), it facilitates the determination of the entire spectrum of in vivo DNA binding sites for any given protein.

• Data structure of ChIP-Chip experiments.

(1) With two-color spotted microarrays: a signal is measured for each intergenic sequence (regulatory region) (Ren et al., 2000).

(2) With high density oligonucleotide arrays: a signal is measured for each probe (25-mer) (Cawley et al., 2004).

• Two-step analysis:

(1) Identification of bound probes, i.e., regulatory regions.

(2) Search for common regulatory motifs, i.e., exact binding site(s), in these sequences.


ChIP-Chip experiments

1. Cross-link DNA and target protein.

2. Sonicate DNA to ~1 kb.

3. IP step: add specific antibody and immunoprecipitate.

4. Reverse cross-links and purify DNA.

5. Amplify, label, and hybridize to microarray.


ChIP-Chip experiments: Spatial structure-blips

[Figure 1 labels: a DNA fragment of ~1 kb with a bound transcription factor; probes ordered according to their locations on the genome; length labels 25 bp and 35 bp.]

DNA is separated from the protein and the ~1 kb regions are fragmented into segments of 50-100 bp. The resulting fragments bind to complementary probes.

Figure 1: ChIP-Chip experiments. Details of the IP-enriched DNA hybridization at the probe level.


ChIP-Chip experiments: spatial structure-blips

[Figure 2: four panels, at locations 24341295, 15643916, 15703036, and 11700329. x-axis: probe no; y-axis: test statistic.]

Figure 2: ChIP-Chip experiments: spatial structure. Plot of the two-sample Welch t-statistics around four different locations on chromosome 21. x-axis: probe index.


ChIP-Chip experiments: spatial structure-blips

[Figure 3: four panels, at locations 24341295, 15643916, 15703036, and 11700329. x-axis: genomic location; y-axis: test statistic.]

Figure 3: ChIP-Chip experiments: spatial structure. Plot of the two-sample Welch t-statistics around four different locations on chromosome 21. x-axis: genomic location.


ChIP-Chip experiments of Cawley et al. (2004)

• ChIP-Chip data for three transcription factors: p53, cMyc, Sp1.

• ∼1.1 million 25-mer probe pairs (PM, MM), spanning non-repeat sequences of human chromosomes 21 and 22, distributed across three Affymetrix chips.

• Target DNA samples from cell lines HCT1116 (p53) and Jurkat (cMyc, Sp1).

• Control DNA samples:

– Whole cell extraction: skip IP step (positive).

– ControlGST: bacterial antibody at IP step (negative).

• For each TF and control, there are six technical replicates, consisting of three hybridization replicates for each of two IP replicates.


Multiple testing procedures for identifying bound probes

$X_{i,j,k}$: quantile normalized (Bolstad et al., 2003) $\log_2(\mathrm{PM})$ value of the $i$-th probe in the $k$-th replicate of the $j$-th group, $i \in \{1, \dots, \sim 1.1 \text{ million}\}$, $j \in \{1, 2\}$, $k \in \{1, \dots, n_j\}$, $n_1 = n_2 = 6$.

$$Y_{i,j} = \frac{1}{n_j} \sum_{k=1}^{n_j} X_{i,j,k}, \quad j \in \{1, 2\}.$$

Let $\mu_i = \mu_{2,i} - \mu_{1,i}$ be the mean $\log_2(\mathrm{PM})$ difference in control and IP-enriched DNA hybridizations for probe $i$.

For each probe $i \in \{1, \dots, \sim 1.1 \text{ million}\}$, we have:

$$H_{0,i}: \mu_i = 0, \qquad H_{1,i}: \mu_i > 0.$$


Multiple testing procedures for identifying bound probes: blips

Two-sample Welch t-statistic:

$$T_{i,n} = \frac{Y_{i,2} - Y_{i,1}}{\sqrt{\sigma_{i,1}^2/n_1 + \sigma_{i,2}^2/n_2}}$$

To take into account the blip structure, consider the following scan test statistics:

$$T^*_{i,n} = \frac{1}{w} \sum_{h=i}^{i+w-1} T_{h,n}, \quad i \in \{1, \dots, N - w + 1\},$$

where $T_{h,n}$ is the two-sample Welch t-statistic for probe $h$.

⟹ Aims to borrow strength across a blip of size $w$ when testing the null hypothesis for a given probe: rejections become easier in the vicinity of bound regions and harder around unbound regions.
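The two statistics above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the array layout (one row per probe), and the toy data are assumptions, and the variances are taken to be the usual unbiased sample variances.

```python
import numpy as np

def welch_t(control, treated):
    """Two-sample Welch t-statistic per probe.

    control, treated: arrays of shape (N, n1) and (N, n2) holding
    normalized log2(PM) values, one row per probe (assumed layout).
    """
    n1, n2 = control.shape[1], treated.shape[1]
    mean_diff = treated.mean(axis=1) - control.mean(axis=1)
    # Unbiased sample variances (ddof=1) for each group.
    se = np.sqrt(control.var(axis=1, ddof=1) / n1
                 + treated.var(axis=1, ddof=1) / n2)
    return mean_diff / se

def scan_statistic(t, w):
    """Average of w consecutive t-statistics: T*_i = mean(t[i : i + w])."""
    csum = np.concatenate(([0.0], np.cumsum(t)))
    return (csum[w:] - csum[:-w]) / w  # length N - w + 1

# Toy data: 50 probes, 6 replicates per group, one hypothetical bound region.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(50, 6))
treated = rng.normal(0.0, 1.0, size=(50, 6))
treated[20:30] += 2.0
t = welch_t(control, treated)
t_star = scan_statistic(t, w=10)
```

With w = 1 the scan statistic reduces to the probe-level t-statistic, so the window width is the only extra tuning parameter.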


Type I error rates

$V_n$: number of falsely rejected hypotheses.

$R_n$: total number of rejected hypotheses.

• Family-wise error rate (FWER): probability of at least one false rejection,

$$\mathrm{FWER} \equiv \Pr(V_n \ge 1).$$

• Tail probability for the proportion of false positives (TPPFP): probability that the proportion $V_n/R_n$ of false positives among the rejected hypotheses exceeds a user-supplied value $q$,

$$\mathrm{TPPFP} \equiv \Pr(V_n/R_n > q), \quad q \in (0, 1).$$

• False discovery rate (FDR): expected value of the proportion $V_n/R_n$ of false positives among the rejected hypotheses,

$$\mathrm{FDR} \equiv E[V_n/R_n], \quad \text{where } V_n/R_n \equiv 0 \text{ if } R_n = 0.$$
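In a simulation study, where the true status of every hypothesis is known, the three error rates can be estimated by averaging over simulated data sets. The sketch below assumes the counts of false and total rejections have already been recorded for each simulated data set; the function name and the toy counts are hypothetical.

```python
import numpy as np

def error_rates(V, R, q=0.05):
    """Monte Carlo estimates of FWER, TPPFP(q), and FDR.

    V, R: one entry per simulated data set, giving the number of
    false rejections and the total number of rejections.
    """
    V = np.asarray(V, dtype=float)
    R = np.asarray(R, dtype=float)
    # Proportion of false positives, defined as 0 when nothing is rejected.
    prop = np.divide(V, R, out=np.zeros_like(V), where=R > 0)
    return {
        "FWER": float(np.mean(V >= 1)),
        "TPPFP": float(np.mean(prop > q)),
        "FDR": float(np.mean(prop)),
    }

# Four toy data sets: (V, R) = (0, 10), (1, 10), (0, 0), (2, 20).
rates = error_rates(V=[0, 1, 0, 2], R=[10, 10, 0, 20], q=0.05)
```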


Controlling the FWER: Bonferroni adjustment

Assumptions: Under the null hypothesis,

• The test statistics have the same marginal null distribution.

• $X_{i,j,k} \sim N(0, \sigma_j^2)$, $j = 1, 2$.

FWER:

$$P_{Q_0}\left( \max_{i \in \{1, \dots, N-w+1\}} T^*_{i,n} > c \right) \le \alpha,$$

where $\alpha$ is the nominal Type I error rate and $c$ is an unknown common cut-off.

Bonferroni adjustment: Let $G_0$ represent the null distribution of the scan test statistics, i.e., the null distribution of the r.v. $T^* = \frac{1}{w} \sum_{h=1}^{w} T_h$. The Bonferroni adjusted cut-off is given by

$$c_B = G_0^{-1}\left(1 - \alpha/(N - w + 1)\right).$$


Controlling the FWER: Nested-Bonferroni adjustment

• The nested-Bonferroni adjustment is given by

$$c_{NB} = F_0^{-1}(1 - \alpha/K),$$

where $F_0$ is the null distribution of the test statistic $Z = \max_{i \in \{1, \dots, w\}} T^*_i$ and

$$K = \left\lceil \frac{N - w + 1}{w} \right\rceil.$$

• Nested-Bonferroni adjustment is less conservative than the Bonferroni adjustment: $c_{NB} \le c_B$.

• Corresponding null distributions can be estimated by parametric bootstrap (using the normality assumption for control and treatment groups under the null hypothesis and simulating the corresponding random variables). For the Bonferroni adjustment, a normal approximation is also possible.
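A parametric bootstrap for both cut-offs might look like the sketch below. It is an illustration under stated assumptions rather than the authors' code: both groups are simulated as N(0, 1) (that is, equal variances in the two groups), the function names are hypothetical, and the number of draws B has to be large relative to (N − w + 1)/α for the extreme quantile to be resolved.

```python
import numpy as np

def null_scan_stats(B, w, n1=6, n2=6, n_scan=1, seed=0):
    """Simulate B null draws of n_scan consecutive scan statistics.

    Sketch assumption: under the null both groups are N(0, 1). The
    Welch t-statistic is unchanged by a common rescaling of the data,
    so this covers any shared variance.
    """
    rng = np.random.default_rng(seed)
    n_probe = w + n_scan - 1                      # probes needed per draw
    x1 = rng.normal(size=(B, n_probe, n1))
    x2 = rng.normal(size=(B, n_probe, n2))
    se = np.sqrt(x1.var(axis=2, ddof=1) / n1 + x2.var(axis=2, ddof=1) / n2)
    t = (x2.mean(axis=2) - x1.mean(axis=2)) / se  # shape (B, n_probe)
    csum = np.concatenate((np.zeros((B, 1)), np.cumsum(t, axis=1)), axis=1)
    return (csum[:, w:] - csum[:, :-w]) / w       # shape (B, n_scan)

def fwer_cutoffs(N, w, alpha=0.05, B=20000, seed=0):
    """Bonferroni and nested-Bonferroni cut-offs from bootstrap quantiles."""
    m = N - w + 1                                 # number of scan statistics
    K = int(np.ceil(m / w))
    t_star = null_scan_stats(B, w, n_scan=1, seed=seed)[:, 0]        # draws from G0
    z = null_scan_stats(B, w, n_scan=w, seed=seed + 1).max(axis=1)   # draws from F0
    c_b = np.quantile(t_star, 1 - alpha / m)
    c_nb = np.quantile(z, 1 - alpha / K)
    return c_b, c_nb

c_b, c_nb = fwer_cutoffs(N=200, w=10)
```

The nested cut-off needs 2w − 1 probes per draw, because w consecutive scan statistics of width w span that many probes.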


Procedures for controlling different Type I error rates

• For control of the FWER: B-FWER, NB-FWER

                               Bonferroni                                     Nested Bonferroni
Null dist                      $G_0$: c.d.f. of the r.v.                      $F_0$: c.d.f. of the r.v.
                               $T^* = \frac{1}{w}\sum_{h=1}^{w} T_h$          $Z = \max_{h \in \{1,\dots,w\}} T^*_h$
Cut-off $c$                    $G_0^{-1}(1 - \alpha/(N - w + 1))$             $F_0^{-1}(1 - \alpha/K)$, where $K = \lceil (N - w + 1)/w \rceil$
Estimation of the null dist    Parametric bootstrap or normal approximation   Parametric bootstrap

They are equivalent when $w = 1$.

• For control of the TPPFP: augmentation procedure of van der Laan et al. (2004). VDP-TPPFP

• For control of the FDR: Benjamini and Hochberg (1995). BH-FDR
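The last two procedures can be sketched generically in terms of p-values. The step-up rule below is the standard Benjamini-Hochberg procedure; the augmentation function follows the usual description of the approach (add the next most significant hypotheses to an FWER-controlling rejection set for as long as the added fraction stays at or below q). Function names are hypothetical, and how p-values are obtained from the scan statistics is outside this sketch.

```python
import numpy as np

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()   # largest rank meeting its bound
        reject[order[: k + 1]] = True
    return reject

def augment_for_tppfp(pvals, fwer_reject, q=0.05):
    """Augment an FWER-controlling rejection set with the next most
    significant hypotheses, keeping extra / (extra + R0) <= q."""
    p = np.asarray(pvals, dtype=float)
    reject = np.asarray(fwer_reject, dtype=bool).copy()
    r0 = int(reject.sum())
    # Small tolerance guards against floating-point error at exact integers.
    extra = int(np.floor(q * r0 / (1 - q) + 1e-12))
    remaining = np.nonzero(~reject)[0]
    add = remaining[np.argsort(p[remaining])][:extra]
    reject[add] = True
    return reject

# Toy example: 1000 ordered p-values, the first 128 rejected under FWER control.
pvals = np.linspace(0.0, 1.0, 1000)
fwer_set = np.arange(1000) < 128
augmented = augment_for_tppfp(pvals, fwer_set, q=0.05)
```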


Simulation studies

• $N$ probes with $n_1 = 6$ control and $n_2 = 6$ treatment observations.

• Non-blip and blip data are generated from distributions $N(\mu_0, \sigma_0)$ and $N(\mu_1, \sigma_1)$, respectively.

Simulation   N      w                          # blips   $(\mu_0, \sigma_0)$   $(\mu_1, \sigma_1)$
0            2000   10                         12        (0, 1)                (2, 0.75)
I            2000   10                         12        (0, 1)                (2, 0.75)
II           2000   10                         12        (0, 1)                (1.5, 1)
III          2000   ∼ Uniform[5, 16]           12        (0, 1)                (1.5, 1)
IV           3000   ∼ Truncated gamma(10, 1)   20        (0, 1)                (1.5, 1)

Table 1: Summary of the simulation settings.

• Estimation of the null distribution of the test statistics is based on $B = 100{,}000$ observations.
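A data generator for a setting like Simulation I could be sketched as follows. The slides do not say how blips are placed or which group carries the shift, so those details are assumptions here: one blip is placed at random inside each of n_blips equal-width slots (so blips cannot overlap), control observations are N(0, 1) everywhere, and treatment observations are N(µ1, σ1) inside blips.

```python
import numpy as np

def simulate_blips(N=2000, w=10, n_blips=12, mu1=2.0, sigma1=0.75,
                   n1=6, n2=6, seed=0):
    """Generate control/treatment data with n_blips non-overlapping blips.

    Returns (control, treated, bound), where bound is a boolean mask of
    the probes that lie inside a blip.
    """
    rng = np.random.default_rng(seed)
    bound = np.zeros(N, dtype=bool)
    slot = N // n_blips                            # width of each slot
    offsets = rng.integers(0, slot - w + 1, size=n_blips)
    starts = np.arange(n_blips) * slot + offsets   # one start per slot
    for s in starts:
        bound[s : s + w] = True
    control = rng.normal(0.0, 1.0, size=(N, n1))
    treated = rng.normal(0.0, 1.0, size=(N, n2))
    treated[bound] = rng.normal(mu1, sigma1, size=(int(bound.sum()), n2))
    return control, treated, bound

control, treated, bound = simulate_blips()
```

With the defaults this yields 12 × 10 = 120 bound probes out of 2000.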


Simulation 0: Comparison of the actual Type I error rates

w Method NB-FWER B-FWER VDP-TPPFP BH-FDR

1 B 0.042 0.042 0.042 0.0440

N 0.042 0.0451

2 B 0.032 0.028 0.002 0.0476

N 0.326 0.0719

5 B 0.05 0.036 0.00 0.0459

N 0.124 0.0559

10 B 0.04 0.024 0.002 0.0449

N 0.054 0.0498

20 B 0.034 0.014 0.004 0.0415

N 0.026 0.0449

Table 2: B: Bootstrap, N: Normal approximation, α = 0.05.


Simulation 0: w = 10

[Figure 4: two boxplot panels, number of rejections and number of correct rejections, each for NB-FWER, B-FWER, VDP-TPPFP, and BH-FDR.]

Figure 4: Boxplot of the number of rejections and number of correct rejections with a blip size of w = 10 for NB-FWER, B-FWER, VDP-TPPFP, BH-FDR.


Summary of the simulations I, II, III, IV

[Figure 5: four panels (Simulation I, II, III, IV), each plotting specificity (y-axis) against sensitivity (x-axis).]

Figure 5: Simulations I, II, III and IV. Specificity versus sensitivity plots. ○: NB-FWER, △: VDP-TPPFP, +: BH-FDR. Different colors represent different assumed blip sizes: w = 1, w = 2, w = 5, w = 10, and w = 20.


Determining the blip size

• The multiple testing procedures considered are indexed by the parameter $w$, i.e., the blip size.

[Diagram: probes of 25 bp separated by 10 bp (35 bp per probe step) along a ~1 kb fragment.]

• Theoretical calculation for the blip size: $25w + 10(w - 1) = 1000 \Rightarrow w \approx 30$ probes.

• Empirical plots of the data suggest a smaller blip size: $w \approx 10$ probes.

• A model selection framework for selecting the blip size.
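The theoretical blip size comes from solving a linear equation in w. A short worked version, assuming 25 bp probes separated by 10 bp gaps that tile a 1 kb fragment:

```latex
\begin{aligned}
25w + 10(w - 1) &= 1000 \\
35w &= 1010 \\
w &= 1010 / 35 \approx 28.9
\end{aligned}
```

This rounds to roughly 30 probes, about three times the blip size suggested by the empirical plots.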


Determining the blip size: Piecewise constant mean regression model for the intensity signal

• Let $(Y_i, L_i)$, $i \in \{1, \dots, N\}$, represent the data on $N$ probes, where $Y_i$ is the two-sample Welch t-statistic and $L_i$ is the genomic location for probe $i$.

• Recall that we have two groups of interest: bound and unbound classes.

• Assume

$$E[Y_i] = I(L_i \notin A)\,\mu_0 + I(L_i \in A)\,\mu_1,$$

where $A$ represents the group of bound probes.

• Estimation: Given the blip start sites, $\mu_0$ and $\mu_1$ can be estimated by ordinary least squares. Use a forward stepwise algorithm to estimate the blip start sites.

• How many blips for a given $w$?
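The estimation step can be illustrated with a naive greedy search. This is a sketch under assumptions, not the authors' algorithm: blips are taken to be w consecutive probes (probe index stands in for genomic location), blips are not allowed to overlap, and each step exhaustively tries every remaining start site, which is only practical for small N.

```python
import numpy as np

def rss_two_means(y, in_blip):
    """Residual sum of squares when each class is fitted by its own mean (OLS)."""
    rss = 0.0
    for mask in (in_blip, ~in_blip):
        if mask.any():
            rss += float(((y[mask] - y[mask].mean()) ** 2).sum())
    return rss

def forward_stepwise_blips(y, w, n_blips):
    """Greedily add the blip start site that most reduces the RSS."""
    y = np.asarray(y, dtype=float)
    in_blip = np.zeros(y.size, dtype=bool)
    starts = []
    for _ in range(n_blips):
        best_start, best_rss = None, np.inf
        for s in range(y.size - w + 1):
            if in_blip[s : s + w].any():
                continue  # sketch assumption: blips do not overlap
            trial = in_blip.copy()
            trial[s : s + w] = True
            rss = rss_two_means(y, trial)
            if rss < best_rss:
                best_start, best_rss = s, rss
        if best_start is None:
            break
        in_blip[best_start : best_start + w] = True
        starts.append(best_start)
    # OLS estimates of the two class means given the selected blips.
    mu0 = float(y[~in_blip].mean())
    mu1 = float(y[in_blip].mean())
    return sorted(starts), mu0, mu1

# Toy signal with two blips of width 5.
y_demo = np.zeros(60)
y_demo[10:15] = 5.0
y_demo[40:45] = 5.0
starts, mu0, mu1 = forward_stepwise_blips(y_demo, w=5, n_blips=2)
```

Running the search for increasing numbers of blips gives the sequence of nested models among which cross-validation can then choose.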


Monte-Carlo cross-validation

• One observation for each probe, i.e., one realization of the test statistics, $Y_i \equiv T_{i,n}$.

[Diagram: B1 → B1H1, B1H2, B1H3; B2 → B2H1, B2H2, B2H3]

Figure 6: Probe level data. B1: IP replicate 1, B2: IP replicate 2, and Hk represents the k-th hybridization replicate.

Training sample: 4 hybridizations, 2 from each of B1 and B2.

Validation sample: 2 hybridizations, 1 from each of B1 and B2.

There are 9 different ways to divide up the data in this manner.
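The nine splits can be enumerated directly. The sketch assumes that each split holds out one hybridization per IP replicate and trains on the remaining two per replicate, which is the reading that yields nine splits; the labels follow Figure 6, and the function name is hypothetical.

```python
from itertools import product

# Labels as in Figure 6: B1/B2 are IP replicates, H1..H3 hybridizations.
replicates = {
    "B1": ["B1H1", "B1H2", "B1H3"],
    "B2": ["B2H1", "B2H2", "B2H3"],
}

def mccv_splits(replicates):
    """Hold out one hybridization per IP replicate; train on the rest."""
    splits = []
    for held_out in product(*replicates.values()):
        validation = list(held_out)
        training = [h for hybs in replicates.values() for h in hybs
                    if h not in held_out]
        splits.append((training, validation))
    return splits

splits = mccv_splits(replicates)
```

Test statistics computed on the training hybridizations are used to fit the piecewise constant model, and the statistics from the validation hybridizations are used to estimate its risk.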


Cross-validated risk over 500 blips on chip A

[Figure 7: two panels plotting cross-validated risk (y-axis) against number of blips (x-axis), with one curve per blip size: w = 1, 2, 10, 20, 30.]

Figure 7: Left panel: Cross-validated risk over 500 blips with five different blip sizes, w ∈ {1, 2, 10, 20, 30}. Right panel: Zooming into the first 30 blips.


[Figure 8: 22 panels (blip-1 through blip-22), each plotting the t-statistic (y-axis) against location (x-axis, 0 to 30).]

Figure 8: p53 ChIP-Chip data. Blips identified on chip A using the NB-FWER multiple testing procedure with an assumed blip size of w = 2. The 22 blips displayed are identified by controlling the FWER at the nominal level α = 0.05.


Control of the FWER for chip A

                     w = 1   w = 2   w = 10   w = 20   w = 30
# blips identified   28      22      14       10       8
# real blips         8       10      13       10       8

Table 3: Multiple testing procedures applied to chip A. The number of real blips was determined by visual inspection. A real blip refers to a small cluster of probes (> 1 probe) whose test statistics are greater than those of its surroundings.


Results on p53 (α = 0.05, q = 0.05)

Annotation       NB-FWER   VDP-TPPFP   BH-FDR
1kb 5’ UTR       6         6           21
3kb 5’ UTR       14        14          47
1kb CpG          17        22          86
3kb CpG          39        45          162
Within a gene    87        93          231
Within an exon   1         1           15
Total            254       269         719

Table 4: Annotation of the chromosomal regions identified by the multiple testing procedures. 12 of the 15 additional blips identified by VDP-TPPFP fall into potential regulatory regions.


Results on p53 (α = 0.05, q = 0.05)

             1kb of 5’   3kb of 5’   1kb of CpG   3kb of CpG   WCR   WE   Total
w = 1
NB-FWER      1           3           6            13           37    6    128
VDP-TPPFP    1           3           6            13           39    7    134
BH-FDR       14          29          31           75           195   18   553
w = 10
NB-FWER      6           14          17           39           87    1    254
VDP-TPPFP    6           14          22           45           93    1    269
BH-FDR       21          47          86           162          231   15   719
w = 20
NB-FWER      5           11          13           27           55    2    188
VDP-TPPFP    6           11          13           28           60    2    208
BH-FDR       9           23          32           68           112   4    355
w = 30
NB-FWER      2           4           7            23           33    0    145
VDP-TPPFP    2           4           7            23           34    0    149
BH-FDR       3           7           15           38           63    1    225


Results on p53 (α = 0.05, q = 0.05)

• Cawley et al. (2004) identified 48 potential p53 binding regions and verified 14 of these using RT-PCR. 23 of our 221 blips overlap with these.

• Our blips include 13 of these experimentally verified regions and 49 additional blips that show at least as high a hybridization signal as this verified group.

• Among these 48, only 1 contains an exact copy of the p53 consensus binding sequence, and none of the verified 14 have consensus matching sequences.

• Among our 221 blips, 4 have an exact copy of the p53 consensus sequence.


Results on p53

Annotation                         Our 221 blips   48 blips by Cawley et al. (2004)
1kb 5’ UTR               # blips   5               0
                         % blips   2               0
1kb CpG                  # blips   17              8
                         % blips   8               17
p53 consensus sequence   # blips   4               1
                         % blips   2               2
Within an ORF            # blips   81
                         % blips   37              ≤ 36∗

∗: Average over 3 transcription factors; includes 5 kb downstream of the 3’ terminal exon.


p53 consensus binding sequence

• Consists of the following arrangement of the consensus DNA sequence RRRCW (▷) and its reverse complement WGYYY (◁):

RRRCWWGYYY[0-15]RRRCWWGYYY

▷◁ − ▷◁, spacer − ∈ [0, 15].

• Wang et al. (1995) showed that the tetrameric p53 protein can bind to various arrangements of multiple copies of the consensus RRRCW.

• Inga et al. (2002) showed that sites with as many as 4 bp mismatches to the 20-mer consensus could be functional and enable high levels of transactivation.
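An exact-match search for the full consensus can be written as a regular expression over the IUPAC codes (R = A or G, W = A or T, Y = C or T). This is a minimal sketch: the function name and the example sequence are made up for illustration, and mismatches are not handled.

```python
import re

# IUPAC codes: R = A or G, W = A or T, Y = C or T.
HALF_SITE = "[AG]{3}C[AT][AT]G[CT]{3}"   # RRRCWWGYYY
# Two half-sites separated by a spacer of 0 to 15 bases; the lookahead
# lets matches that start at different positions overlap.
P53_CONSENSUS = re.compile(f"(?=({HALF_SITE}[ACGT]{{0,15}}{HALF_SITE}))")

def find_p53_sites(seq):
    """Return (start, matched_sequence) for each position that begins a match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in P53_CONSENSUS.finditer(seq)]

# Hypothetical sequence: two half-sites separated by a 4-base spacer.
example = "TTT" + "AGACATGCCT" + "ACGT" + "GGGCTTGTCC" + "AAA"
hits = find_p53_sites(example)
```

Because the spacer quantifier is greedy, each start position reports the match with the longest admissible spacer.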


Enrichment for p53 consensus binding sequence

Arrangement (▷ = RRRCW, ◁ = WGYYY)   verified   filtered   all
▷◁−▷, ▷◁−◁, ▷−▷◁, ◁−▷◁               7/13       21/49      86/221
▷◁                                   8/13       33/49      118/221
▷◁−▷◁ with at most 2 mismatches      7/13       35/49      141/221

Table 5: Occurrences of various arrangements of the 5-mer RRRCW among the 13 experimentally verified blips of Cawley et al. (2004), our 49 filtered blips that show higher hybridization signal than the experimentally verified blips, and all of our 221 blips.


Summary

• The scan statistic allows incorporation of the spatial data structure into multiple testing procedures.

• Identified blips show enrichment in terms of various arrangements of the p53 partial consensus sequence RRRCW as well as enrichment for potential promoter regions.

• Monte Carlo cross-validation in a piecewise constant regression model provides a guide for choosing the appropriate blip size.

• More ChIP-Chip data will become available as part of the ENCODE project.


Some other issues related to ChIP-Chip data

• Type of controls: whole cell extract versus mock IP experiments.

• Size and spacing of the arrayed elements: design of the arrays for IP-enriched DNA hybridization.

• Detailed characterization of the spatial structure: fragment length distribution as a result of sonication.


References

• S. E. Cawley et al. (2004). Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116: 499-509.

• S. Keles, M. J. van der Laan, S. Dudoit, and S. E. Cawley (2004). Multiple Testing Methods for ChIP-Chip High Density Oligonucleotide Array Data. http://www.bepress.com/ucbbiostat/paper147/

• M. J. Buck and J. D. Lieb (2004). ChIP-Chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics 83(3): 349-360.



EXTRA SLIDES


Results on p53 (α = 0.05, q = 0.05)

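[Figure 9: bar chart of the percentage (0 to 100%) of identified regions in each annotation category (1kb 5’ UTR, 3kb 5’ UTR, 1kb CpG, 3kb CpG, within a gene, within an exon) for NB-FWER, VDP-TPPFP, and BH-FDR.]

Figure 9: Annotation of the chromosomal regions identified by the multiple testing procedures.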


Results on p53 (α = 0.05, q = 0.05)

254 211 179 49

1kb 5’ UTR 6 6 5 0

3kb 5’ UTR 14 12 9 2

1kb 3’ UTR 2 1 1 1

3kb 3’ UTR 8 6 4 2

1kb CpG 17 13 13 2

3kb CpG 39 30 28 4

Within a gene 87 71 66 10

Within an exon 1 1 1 0

Table 6: Annotation of the post-processed chromosomal regions identified by the NB-FWER procedure.

