Comparative genomics, ChIP-chip and transfections to find cis-regulatory modules
Penn State University, Center for Comparative Genomics and Bioinformatics: Webb Miller, Francesca Chiaromonte, Ross Hardison
Children’s Hospital of Philadelphia: Mitch Weiss, Lou Dore
NimbleGen: Roland Green, Xinmin Zhang
Cold Spring Harbor, March 2007
What is conservation good for?
Ideal cases for interpretation by comparative genomics
• Negative (purifying) selection: DNA segments with a function common to divergent species.
• Positive (adaptive) selection: DNA segments in which change is beneficial to at least one of the two species.
[Figure: similarity plotted against position along chromosome, relative to neutral DNA, for human vs mouse and human vs rhesus; P (not neutral)]
Putative transcriptional regulatory regions = pTRRs
• Antibodies vs 10 sequence-specific factors:
– Sp1, Sp3, E2F1, E2F4, cMyc, STAT1, cJun, CEBPe, PU1, RA Receptor A
– High resolution ChIP-chip platforms: Affymetrix and NimbleGen
– Data from several different labs in ENCODE consortium
• High likelihood hits for ChIP-chip
– 5% false discovery rate
• Supported by chromatin modification data
– Modified histones in chromatin: H4Ac, H3Ac, H3K4me, H3K4me2, H3K4me3, etc.
– DNase hypersensitive sites (DHSs) or nucleosome depleted sites
• Result: set of 1369 pTRRs
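The 5% false discovery rate cutoff above can be illustrated with a short sketch. The slides do not say which FDR procedure the ENCODE labs used; the Benjamini-Hochberg step-up procedure shown here is an assumption chosen because it is a common default, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if it passes the FDR cutoff.

    Assumption: Benjamini-Hochberg step-up; the slides name no procedure.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Everything ranked at or below max_k is called a hit.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            keep[idx] = True
    return keep


# Hypothetical p-values for six candidate ChIP-chip regions.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
hits = benjamini_hochberg(example_p, fdr=0.05)
```

With these toy values only the first two regions pass, because the third smallest p-value (0.039) exceeds its rank-scaled threshold of 0.025.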
Functional classes show distinctive trends in phylogenetic depth of conservation
Genes likely regulated by clade-specific pTRRs are enriched for distinctive functions
[Figure axis: millions of years; tick labels 310, 450, 91, 173]
Percentage of pTRRs that align no further than:
Primates: 3%
Eutherians: 71%
Marsupials: 21%
Tetrapods: 4%
Vertebrates: 1%
David King
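A tally like "percentage of pTRRs that align no further than" each clade can be sketched as follows. This is a minimal illustration, assuming each pTRR has already been annotated with the set of clades in which it aligns; the clade ordering, function names and toy data are hypothetical and are not taken from the analysis behind the slide.

```python
from collections import Counter

# Clades ordered from shallowest to deepest phylogenetic depth.
CLADES = ["Primates", "Eutherians", "Marsupials", "Tetrapods", "Vertebrates"]


def deepest_clade(aligned_clades):
    """Return the deepest clade in which a segment still aligns."""
    depth = max(CLADES.index(c) for c in aligned_clades)
    return CLADES[depth]


def depth_percentages(segments):
    """Percentage of segments whose alignment stops at each clade."""
    counts = Counter(deepest_clade(s) for s in segments)
    total = len(segments)
    return {c: 100.0 * counts[c] / total for c in CLADES}


# Toy input: four hypothetical segments and the clades they align in.
toy = [
    {"Primates"},
    {"Primates", "Eutherians"},
    {"Primates", "Eutherians"},
    {"Primates", "Eutherians", "Marsupials"},
]
result = depth_percentages(toy)
```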
Enriched GO categories: Immune response; Protease inhibition; Mitosis and cell cycle; Transcriptional regulation; Ion transport
q-value for FDR: 0.0006; 0.0005; 0.0005; 0.004; 0.012
Regulatory potential (RP) captures pattern, composition and constraint in alignments
Genome Research 16:1585 (2006)
• High RP for an aligned sequence means it contains patterns similar to those found in gene regulatory regions
– Positive training set: Alignments of known regulatory regions
– Negative training set: Alignments of likely neutral DNA (ancestral repeats)
• Human and mouse RP scores are on UCSC Genome Browser and PSU’s Galaxy
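The idea of scoring an alignment by how much it resembles the positive rather than the negative training set can be sketched as a log-odds score between two Markov models. This is a deliberately simplified toy, not the published RP method: the three-symbol alphabet (M = match, S = substitution, G = gap), the first-order model and the training strings are all assumptions made for illustration.

```python
import math


def train_markov(sequences, alphabet, pseudocount=1.0):
    """Estimate first-order transition probabilities with pseudocounts."""
    counts = {a: {b: pseudocount for b in alphabet} for a in alphabet}
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in alphabet}
    return probs


def log_odds_score(seq, pos_model, neg_model):
    """Sum of log ratios of transition probabilities (positive vs negative)."""
    return sum(
        math.log(pos_model[p][c] / neg_model[p][c])
        for p, c in zip(seq, seq[1:])
    )


ALPHABET = "MSG"  # hypothetical alignment-column symbols
# Toy training sets standing in for known regulatory regions and neutral DNA.
pos_model = train_markov(["MMMMSMMMM", "MMMMMMGMM"], ALPHABET)
neg_model = train_markov(["MSGSMSGSM", "SGSMSSGMS"], ALPHABET)

conserved_like = log_odds_score("MMMMMM", pos_model, neg_model)
neutral_like = log_odds_score("SGSGSG", pos_model, neg_model)
```

A positive score means the column pattern is more typical of the positive training set; a negative score means it looks more like the neutral set.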
High RP plus conserved consensus motif is a good predictor of CRMs around GATA-1 regulated genes
Genome Research 16:1480 (2006)
Genes Co-expressed in Late Erythroid Maturation
G1E cells: proerythroblast line lacking the transcription factor GATA-1.
G1E-ER cells: rescued by expressing an estrogen-responsive form of GATA-1.
Rylski et al., Mol Cell Biol. 2003
Predict CRMs based on alignment and expression of nearby genes
• Gene is up- or down-regulated by GATA-1
• Noncoding DNA sequence
• Aligns between mouse and other mammals and has a positive RP score
• Contains a conserved consensus binding site motif for GATA-1
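The four criteria above can be combined into a simple filter. The sketch below is an illustration only: the record fields (gene_responds, noncoding, rp_score, mouse_row, other_row) are hypothetical names, WGATAR is used as the commonly cited GATA-1 consensus, and "conserved" is approximated as the motif starting at the same alignment column in both rows, which ignores gaps inside the motif.

```python
import re

# WGATAR on the forward strand, or its reverse complement YTATCW.
# The lookahead lets overlapping matches be reported.
GATA_MOTIF = re.compile(r"(?=([AT]GATA[AG]|[CT]TATC[AT]))")


def motif_columns(aligned_row):
    """Start columns of the GATA motif in one row of an alignment."""
    return {m.start() for m in GATA_MOTIF.finditer(aligned_row.upper())}


def has_conserved_motif(mouse_row, other_row):
    """True if the motif starts at the same column in both aligned rows."""
    return bool(motif_columns(mouse_row) & motif_columns(other_row))


def is_precrm(candidate):
    """Apply the four criteria from the slide to one candidate record."""
    return (
        candidate["gene_responds"]          # nearby gene up- or down-regulated
        and candidate["noncoding"]          # segment is noncoding
        and candidate["rp_score"] > 0       # aligned, with positive RP score
        and has_conserved_motif(candidate["mouse_row"], candidate["other_row"])
    )


# Hypothetical candidates with toy aligned sequences.
good = {"gene_responds": True, "noncoding": True, "rp_score": 0.12,
        "mouse_row": "CCAGATAAGG", "other_row": "CTAGATAAGC"}
low_rp = dict(good, rp_score=-0.05)
motif_lost = dict(good, other_row="CCAGCTAAGG")
```

Here `good` passes all four criteria, `low_rp` fails on the RP score, and `motif_lost` fails because the motif is absent from the second species.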
preCRMs with conserved consensus GATA-1 BS tend to be active on transfected plasmids
DNA segments with positive RP and a GATA-1 binding motif validate as enhancers at a
Test of neutrality using polymorphism and divergence data
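The slide does not name the test that was used. One common way to compare polymorphism with divergence is a McDonald-Kreitman-style 2x2 table (polymorphic sites vs fixed differences, in a candidate region vs a neutral reference) evaluated with Fisher's exact test. The sketch below assumes that design; the counts are hypothetical, and the test is implemented directly so that it needs only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts (not from the slides):
#                      polymorphic  fixed differences
# candidate region          1              9
# neutral reference        11              3
p_value = fisher_exact_two_sided(1, 9, 11, 3)
```

A small p-value indicates that the ratio of polymorphism to divergence in the candidate region differs from the neutral reference, which is the kind of departure a neutrality test looks for.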
A promoter distal to the beta-like globin genes has a signal for recent purifying selection
The distal promoter is close to the locus control region for beta-globin genes
Evolutionary approaches to predicting and analyzing regulatory regions
• Sequence comparison alone will not detect all regulatory regions
– Need comprehensive protein-binding data
• Comparative genomics can help interpret the binding data
– Aspects of regulation of some functional groups are clade-specific
– Depth of conservation may correlate with certain types of function
• Strong constraint on basal mechanisms?
• Lineage-specific “fine tuning”?
• A majority of sites occupied by GATA-1 in G1E-ER cells have some function other than enhancement (by our assays)
• Incorporation of pattern and composition information along with conservation can lead to effective discrimination of functional classes (regulatory potential).
Many thanks …
B: Yong Cheng, Ross, Yuepin Zhou, David King
F: Ying Zhang, Joel Martin, Christine Dorman, Hao Wang
PSU Database crew: Belinda Giardine, Cathy Riemer, Yi Zhang, Anton Nekrutenko
Alignments, chains, nets, browsers, ideas, …: Webb Miller, Jim Kent, David Haussler
RP scores and other bioinformatic input: Francesca Chiaromonte, James Taylor, Shan Yang, Diana Kolbe, Laura Elnitski
Funding from NIDDK, NHGRI, Huck Institutes of Life Sciences at PSU
Categories of Tested DNA Segments
Regulatory potential (RP) to distinguish functional classes
Examples of validated preCRMs
ChIP-chip hits for GATA-1 occupancy
Mpeak TAMALPAIS
275 hits in both
276 hits in both
216 6059
321 total ChIP-chip hits
Technical replicates of ChIP-chip with antibody against GATA1-ER
19 ChIP-chip hits were tested by qPCR: 13 were validated (~70%)