Lecture 5. Topics in Gene Regulation and Epigenomics (Prediction of Enhancers and Enhancer Targets) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology
Jan 04, 2016
CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Fall 2015 2
Lecture outline
1. Cis-regulatory modules and enhancers
2. Prediction of enhancers
  – General
  – Context-specific
3. Prediction of enhancer targets
4. Experimental validations
Last update: 3-Oct-2015
CIS-REGULATORY MODULES AND ENHANCERS
Part 1
Regulatory sequence elements
• DNA sequence elements playing regulatory roles in transcription by interacting with DNA-binding proteins
  – Promoters: Initiating transcription
  – Enhancers: Enhancing transcription
    • Locus control regions (LCRs): Enhancing a set of linked genes
  – Silencers: Repressing transcription
  – Insulators: Setting gene boundaries, blocking promoter-enhancer interactions
The binding proteins
• There are different types of DNA-binding proteins
  – By specificity: sequence-specific vs. non-specific
  – By function: transcriptional regulation, DNA cleavage, DNA modification, etc.
• The regulatory elements are bound by proteins generally called transcription factors (TFs)
  – The binding sites of the TFs are called transcription factor binding sites (TFBSs)
The binding proteins
• More specific (and somewhat confusing) names of particular subtypes of TFs:
  – Enhancers are bound by activators
  – Silencers are bound by repressors
  – Insulators are commonly bound by a protein called CTCF (CCCTC-Binding Factor)
  – (Promoters are bound by transcription factors and RNA polymerase, but the polymerase itself is not counted as a transcription factor)
• In many cases, a TF is a protein complex
  – At least one subunit of the complex contains a DNA-binding domain
  – Other subunits do not directly bind DNA (and are called co-activators, for example)
Recognition of sequence elements
• How does a TF decide where to bind?
  – Where DNA is accessible
  – Where there are special signals on the DNA (e.g., lack of methylation) and the surrounding proteins (e.g., histone modifications)
  – Where the DNA structure is suitable
    • Minor groove shape, propeller twist, etc.
  – Where the DNA sequence is suitable
    • Motifs that are usually short (e.g., 6-10bp)
Recognition of sequence elements
Image credit: Papavassiliou, Molecular medicine Today 4(8):358-366, (1998)
Effects of regulatory elements
Image credit: Sholtis and Noonan, Trends in Genetics 26(3):110-118, (2010)
No TF binding: only basal expression of gene A
TF1 binding enhancer (at limbs): elevated expression of gene A
TF2 binding enhancer (at brain): elevated expression of gene A
TF3 binding silencer: expression of gene A inhibited
CTCF not binding insulator: binding of TF1 at enhancer can affect both gene A and gene B
Locations of regulatory elements
Image credit: Maston et al., Annual Review of Genomics and Human Genetics 7:29-59, (2006)
Enhancers
• Enhancers have been a major research focus in the past few years for several reasons:
  – Very incomplete catalog, difficult to locate them
    • Can be upstream or downstream of target gene
    • Can be relatively close (within kilobases) or far away (megabases) from target gene, estimated average: ~100kb
  – Context-specific
  – Availability of large-scale experimental methods and data
    • Identification of enhancers
    • Validation of enhancers
  – Annotation of disease-associated non-coding variants
  – Discovery of super enhancers
Cis-regulatory modules
• A cis-regulatory module (CRM) is a module of multiple sequence elements that regulate the expression of genes nearby (“in cis”)
• The term is sometimes used as a synonym for enhancer
• However, based on the precise definitions:
  – CRMs can also include other cis-acting regulatory elements
  – Enhancers may not always function in cis (some enhancers regulate very distal genes in trans)
  – An enhancer does not necessarily constitute a module of multiple TF binding sites
Function of enhancers
• Enhancer-promoter looping
  1. DNA
  2. Enhancer
  3. Promoter
  4. Gene
  5. Transcriptional activator/co-activator
  6. Mediator
  7. RNA polymerase
Image credit: Jon Cheff (Wikipedia)
More about DNA looping
(a) Intragenic loops joining the 5′ and 3′ ends of genes may allow recycling of RNA Pol II and facilitate maintenance of transcriptional directionality.
(b) Enhancer-promoter loops, mediated by sequence-specific transcription factors and possibly assisted by noncoding RNAs or by general DNA binding factors such as CTCF and cohesin, lead to transcriptional activation.
(c) Loops between Polycomb-bound regions (PREs) and promoters prevent RNA Pol II recruitment and/or impair transcriptional elongation of promoter-bound RNA polymerases.
(d) Insulator-mediated loops may segregate individual loci containing the coding part of the gene and its regulatory regions from the surrounding genome landscape with other regulatory elements.
Image credit: Cavalli and Misteli, Nature Structural and Molecular Biology 20(3):290-299, (2013)
More about DNA looping
• “Loop gene” vs. “anchor gene”: Up-regulation of anchor genes > loop genes > non-interacting genes
Image credit: Fullwood et al. Nature 462(7269):57-64, (2009)
Signatures of enhancers
• General for active/functional DNA:
  – Evolutionary conservation
  – Open chromatin
• General for regulatory elements:
  – Containing TFBS
    • Cluster of TFBS
• Specific to enhancers:
  – P300 binding
  – H3K4me1, H3K4me2, H3K27ac
    • Signals and signal patterns
  – Enhancer RNA
• Lack of inactive marks and marks of other types of regulatory elements
Context specificity
• An enhancer can be active in some contexts (cell type, tissue type, etc.) and inactive/poised in some other contexts
Image credit: Shlyueva et al. Nature Reviews Genetics 15(4):272-286, (2014)
Computational problems
1. Prediction of enhancers
  a) “General”, i.e., genomic regions that are enhancers in some contexts
  b) Context-specific, i.e., enhancers that are active in a given context
2. Prediction of target genes of enhancers
  a) “General”
  b) Context-specific
PREDICTION OF GENERAL ENHANCERS
Part 2a
Static features
• In the past, static features were used to predict general enhancers, i.e., features that remain the same across different contexts
  – Presence/density of TFBS
    • Based on sequence motifs
  – Evolutionary conservation
    • Based on multiple sequence alignment
Learning approach
• Supervised: Learn features based on known examples
  – Few good examples (more on this later)
• Unsupervised: Threshold features
  – Mostly arbitrarily
• Semi-supervised: Combine information from known examples and distribution of genomic regions in the feature space
Overview of strategies
Image credit: Su et al. PLOS Computational Biology 6(12):e1001020, (2010)
Classification of methods
Image credit: Su et al. PLOS Computational Biology 6(12):e1001020, (2010)
TFBS cluster methods
• Choice of TFs:
  – All TFs with known motifs
  – TFs known to bind enhancers (incomplete knowledge)
• Definition of clusters:
  – High density of TFBS, as compared to background
  – Occurrence of binding sites of the same set of TFs
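As a rough illustration of the "high density of TFBS" idea, the sketch below scans a sequence with a toy position weight matrix and reports windows that contain several motif hits. The function names, the toy matrix for "TGA", the score threshold, and the fixed `min_hits` cutoff (standing in for a proper comparison against background) are all assumptions made for this example, not part of any published method.

```python
def scan_motif(seq, pwm, threshold):
    """Return start positions where the summed PWM score reaches the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        if score >= threshold:
            hits.append(start)
    return hits


def dense_windows(hits, seq_len, window, min_hits):
    """Return window start positions that contain at least min_hits motif starts.

    A fixed min_hits is a simplification; a real method would compare the
    observed density with a background model.
    """
    dense = []
    for w_start in range(max(seq_len - window + 1, 1)):
        count = sum(1 for h in hits if w_start <= h < w_start + window)
        if count >= min_hits:
            dense.append(w_start)
    return dense


# Toy PWM (hypothetical scores): +1 for the consensus base "T", "G", "A", -1 otherwise.
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
]
toy_seq = "TGACC" + "TGA" + "TGA" + "A" * 9

toy_hits = scan_motif(toy_seq, toy_pwm, threshold=3.0)
toy_clusters = dense_windows(toy_hits, len(toy_seq), window=10, min_hits=3)
```

With several TFs, the same scan could be repeated per motif and the hits pooled before counting, which corresponds to the "same set of TFs" definition of a cluster.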
Finding TF binding motifs
• Alignment of promoter sequences
Image credit: D'haeseleer et al. Nature Biotechnology 24(8):959-961, (2006)
Finding TF binding motifs
• High-throughput SELEX (systematic evolution of ligands by exponential enrichment): Testing the binding of TF protein/binding domain with random nucleotide sequences
Image credit: Jolma et al. Cell 152(1-2):327-339, (2013)
Sequence conservation methods
• Non-coding elements with “extreme” conservation: Genomic regions with
  – High human–pufferfish (Takifugu (Fugu) rubripes) conservation, or ultra-high human–mouse–rat conservation
  – High sequence match score
  – No sign of transcription or protein-coding
• Results:
  – Among 167 predictions tested in a mouse assay, 45% reproducibly show enhancer activities
Sequence conservation methods
Image credit: Pennacchio et al., Nature 444(7118):499-502, (2006)
TFBS cluster conservation methods
• Conservation of TFBS clustering (distance), affinity and conservation
(A) EEL scoring function. Top: schematic representation of two TFs (blue and red ovals) bound to DNA of unequal length from two different species. Side view (top left) indicates mean distance (x̄) and difference in distance (Δx), and front view (top right) indicates difference in angle (Δϕ) of the two factors bound to DNA (open circle). Position weight matrix scores for TFs were used as a proxy for binding affinity in calculation of ΔGT, the sum of TF affinities to sites in both species. Bottom: the score function. See Supplemental Data for details.
(B) EEL analysis (left) using the five known TFs that regulate eve (Hunchback, Caudal, Knirps, Bicoid, and Kruppel) identifies all four enhancers driving striped expression of Drosophila eve (right). Blue diagonal lines indicate aligned regions, and black lines on the x and y axes represent the conserved TF binding sites that constitute the cis-modules (CM). Number after the CM indicates its rank based on its EEL score.
(C) Text display of EEL alignment of part of the eve Stripe 3/7 enhancer (CM1 from [B]). D. pseudoobscura and D. melanogaster sequences are on top and bottom lines, respectively. EEL aligns the DNA sequences between the conserved TF sites for clarity; the DNA alignment does not contribute to the EEL score. Yellow boxes indicate conserved binding sites of Hunchback (Hb) or Knirps (Kni), which regulate this cis-module ( Small et al., 1996).
(D) A distal −20 kb enhancer element in the mouse and human MyoD genes is identified by EEL analysis.
Image credit: Hallikas et al., Cell 124(1):47-59, (2006)
Comparison of methods
• REDfly validated regulatory modules against short exons and introns
Image credit: Su et al. PLOS Computational Biology 6(12):e1001020, (2010)
Combination of methods
• Performance change of method pairs
Image credit: Su et al. PLOS Computational Biology 6(12):e1001020, (2010)
PREDICTION OF CONTEXT-SPECIFIC ENHANCERS
Part 2b
Context-specific enhancer activities
• The above prediction methods can only predict whether a genomic region is an enhancer in some context, but not its actual activity in a given context
• High-throughput sequencing made it possible to obtain a lot of useful context-specific data
  – Protein binding
  – Chromatin accessibility
  – Histone modifications
  – Enhancer RNA
Chromatin accessibility
• Analysis of a known enhancer (HS2 of the β-globin LCR) and other DNase I hypersensitive sites (DHSs) with similar patterns across cell types
  – 14 of the 20 displayed enhancer activity
Image credit: Thurman et al. Nature 489(7414):75-82, (2012)
Histone modifications
• Typical signatures of predicted enhancers
Image credit: Heintzman et al. Nature 459(7423):108-112, (2009)
Enhancer RNA
• Bi-directional, non-coding transcripts around active enhancers
  – May play a functional role in gene regulation
Image credit: Andersson et al., Nature 507(7493):455-461, (2014)
Unsupervised predictions
• Whole-genome segmentation (ChromHMM) using hidden Markov models based on histone marks
  – Manual interpretation of the resulting states
Image credit: Ernst and Kellis, Nature Methods 9(3):215-216, (2012)
Unsupervised predictions
• Whole-genome segmentation using Segway
  – E: enhancer; GM: gene middle
Image credit: Hoffman et al., Nature Methods 9(5):473-476, (2012)
Unsupervised predictions
• Rule-based filtering
Input: Human genome GRCh37
Step 0: Divide into 100bp bins: 30,956,951 bins
Step 1: Remove blacklisted regions: 30,840,726 bins (116,225 bins filtered)
Step 2: Remove bins with K562 BAR score <= 0.9: 461,722 bins
Step 3a: Remove bins with K562 promoter score > 0.8: 412,000 bins
Step 3b: Remove bins within +/- 2000bp from Gencode TSS: 257,666 bins
Step 4: Remove bins that intersect Gencode exons: 243,951 bins
Step 5: Remove bins with phastCons primate score < 0.1: 97,193 bins
Step 6: Merge adjacent bins into longer intervals: 55,857 intervals
Step 7: Remove intervals with no binding motifs of expressed (IDR < 0.05) TFs: 59,425 intervals
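The filtering steps 1 to 6 of this pipeline can be sketched as a chain of per-bin tests followed by a merge. The `Bin` record, its field names, and the boundary handling (for example, treating a bin exactly 2000bp from a TSS as removed) are assumptions made for illustration; the motif filter of step 7 is left out, and real scores would come from the corresponding annotation tracks.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    chrom: str
    start: int             # start coordinate of a 100 bp bin
    bar_score: float       # hypothetical per-bin K562 BAR score
    promoter_score: float  # hypothetical per-bin K562 promoter score
    dist_to_tss: int       # signed distance to the nearest annotated TSS
    overlaps_exon: bool
    phastcons: float       # primate phastCons score
    blacklisted: bool = False


def keep_bin(b):
    """Apply the per-bin filters (steps 1 to 5) with the thresholds on the slide."""
    return (
        not b.blacklisted
        and b.bar_score > 0.9
        and b.promoter_score <= 0.8
        and abs(b.dist_to_tss) > 2000
        and not b.overlaps_exon
        and b.phastcons >= 0.1
    )


def merge_adjacent(bins, bin_size=100):
    """Merge bins that touch end-to-start on the same chromosome (step 6)."""
    intervals = []
    for b in sorted(bins, key=lambda x: (x.chrom, x.start)):
        if intervals and intervals[-1][0] == b.chrom and intervals[-1][2] == b.start:
            intervals[-1][2] = b.start + bin_size
        else:
            intervals.append([b.chrom, b.start, b.start + bin_size])
    return [tuple(iv) for iv in intervals]
```

A typical use would be `merge_adjacent([b for b in all_bins if keep_bin(b)])`, where `all_bins` is whatever list of annotated bins is available.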
Supervised predictions
• If we want to train a supervised model, we need positive and negative examples
• Difficulties:
  – Even one of the biggest databases of validated enhancers, VISTA, contains only 1203 positive examples and 1076 negative examples as of Mar 2015
  – We do not have complete knowledge of the context-specific activities of these examples
    • Positive: We do not know in which contexts they are positive
    • Negative: We do not know whether they are always negative
  – They are biased: usually the most confident former predictions with high sequence conservation
Constructing positive examples
• Use some features to define positive examples
• These features should not be used in the prediction process (except in some special settings), otherwise
  – They will dominate the resulting models
  – Prediction accuracy cannot be evaluated
Constructing negative examples
• Sampling genomic regions likely to be negatives
  – Random regions
    • Too negative, decision boundary can be fuzzy
  – Other types of sequence element
    • Resulting model is for distinguishing between these element types rather than identifying enhancers
    • Hard to ensure mutual exclusiveness
  – Enhancers known to be active only in other contexts
    • Hard to obtain
  – Combination of the above
• In general, it is always a tough decision as to what properties of the positive examples the negative examples should match
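The "random regions" option can be sketched as rejection sampling: draw random intervals of a fixed length and discard any that overlap a positive example or an already chosen negative. The function names, the single-chromosome setting, and the matched region length are assumptions for this example; matching further properties of the positives (GC content, distance to genes, and so on) is deliberately left out.

```python
import random


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def sample_negatives(chrom_len, positives, region_len, n, seed=0, max_tries=10000):
    """Sample up to n non-overlapping random regions that avoid all positives."""
    rng = random.Random(seed)
    negatives = []
    tries = 0
    while len(negatives) < n and tries < max_tries:
        tries += 1
        start = rng.randrange(0, chrom_len - region_len + 1)
        end = start + region_len
        taken = list(positives) + negatives
        if any(overlaps(start, end, s, e) for s, e in taken):
            continue  # reject and draw again
        negatives.append((start, end))
    return negatives
```

The `max_tries` guard means the function may return fewer than `n` regions when little free sequence is left, which callers would need to check.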
Supervised predictions
• RFECS (Random Forest based Enhancer identification from Chromatin States)
  – Positives: gene-distal P300 binding sites overlapping DHSs
  – Negatives:
    • TSSs overlapping DHS
    • Random regions distal from P300 binding sites or TSSs
  – Features:
    • 24 histone marks
    • For each histone mark, average signal of 20 bins around the target region (to capture signal pattern)
  – Machine learning model:
    • Random Forest
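One possible reading of the feature definition is sketched below: for each histone mark, the signal around a candidate site is cut into 20 consecutive bins and averaged per bin, and the per-mark profiles are concatenated. This is an illustrative interpretation, not the RFECS code; the function names, the per-base list representation of the signal, and the 100bp default bin size are assumptions.

```python
def binned_profile(signal, center, n_bins=20, bin_size=100):
    """Mean signal in n_bins consecutive bins centered on `center`.

    `signal` is a list of per-base values for one histone mark on one
    chromosome. Bins that fall outside the signal contribute 0.0.
    """
    half = n_bins * bin_size // 2
    profile = []
    for i in range(n_bins):
        lo = center - half + i * bin_size
        hi = lo + bin_size
        values = signal[max(lo, 0):max(hi, 0)]
        profile.append(sum(values) / len(values) if values else 0.0)
    return profile


def feature_vector(signals_by_mark, center, n_bins=20, bin_size=100):
    """Concatenate the binned profiles of all marks, in sorted mark order."""
    features = []
    for mark in sorted(signals_by_mark):
        features.extend(binned_profile(signals_by_mark[mark], center, n_bins, bin_size))
    return features
```

Vectors built this way for the positive and negative regions could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`.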
Supervised predictions
• RFECS: Patterns around P300 binding sites
Image credit: Rajagopal et al., PLOS Computational Biology 9(3):e1002968, (2013)
Supervised predictions
• RFECS: Prediction accuracy
Image credit: Rajagopal et al., PLOS Computational Biology 9(3):e1002968, (2013)
Supervised predictions
• RFECS: Feature importance and co-occurrence
Image credit: Rajagopal et al., PLOS Computational Biology 9(3):e1002968, (2013)
Semi-supervised predictions
• Several main approaches
  – Mainly unsupervised, but using known examples to bias the clustering process (e.g., requiring that certain regions receive the same state)
  – Mainly supervised, but using global distribution of regions in the feature space to discover sub-classes
  – Optimizing a function that includes:
    • Prediction accuracy of known examples
    • Likelihood/posterior probability of data
    • Model complexity
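One hypothetical way to write down the third approach is as a single objective with three terms. The symbols below are introduced only for illustration: $L$ is the set of labeled regions, $\ell$ a classification loss, $f_\theta$ the model, $X$ all genomic regions (labeled or not), $\Omega$ a complexity penalty, and $\lambda_1, \lambda_2 \ge 0$ weights that trade off the three terms.

```latex
\hat{\theta} = \arg\min_{\theta} \left[
    \sum_{i \in L} \ell\bigl(y_i, f_\theta(x_i)\bigr)   % prediction error on known examples
    \; - \; \lambda_1 \log p(X \mid \theta)              % (negative) log-likelihood of all data
    \; + \; \lambda_2 \, \Omega(\theta)                  % model complexity penalty
\right]
```

Setting $\lambda_1 = 0$ gives a purely supervised, regularized model, while dropping the first term gives a purely unsupervised one.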
PREDICTION OF ENHANCER TARGETS
Part 3
Enhancer targets
• In theory, enhancers can be upstream or downstream of, and either near or far away from, their targets
• Features useful for identifying enhancer targets:
  – Distance
    • Closest gene(s)
  – Activity correlations
  – Co-conservation/co-evolution
Distance
• It is true that
  – Enhancers can be far away from their targets
  – The gene closest to an enhancer may not be its target
• However,
  – In general, the closer an enhancer is to a gene in the DNA sequence, the higher the chance that the gene is the target of the enhancer
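A distance-based baseline follows directly from this observation: rank candidate genes by the distance between the enhancer and each gene's TSS. The sketch below assumes a single chromosome, one TSS per gene, and hypothetical gene names and coordinates.

```python
def nearest_gene(enhancer_mid, tss_by_gene):
    """Return the gene whose TSS is closest to the enhancer midpoint."""
    return min(tss_by_gene, key=lambda g: abs(tss_by_gene[g] - enhancer_mid))


def rank_genes_by_distance(enhancer_mid, tss_by_gene, max_distance):
    """Return genes within max_distance of the enhancer, closest first."""
    candidates = [
        (abs(tss - enhancer_mid), gene)
        for gene, tss in tss_by_gene.items()
        if abs(tss - enhancer_mid) <= max_distance
    ]
    return [gene for _, gene in sorted(candidates)]


# Hypothetical coordinates on one chromosome.
example_tss = {"geneA": 10_000, "geneB": 150_000, "geneC": 5_000_000}
example_ranked = rank_genes_by_distance(120_000, example_tss, max_distance=1_000_000)
# example_ranked == ["geneB", "geneA"]
```

Because the closest gene is not always the true target, such a ranking is better used as one feature among several than as a final prediction.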
Activity correlation
• Main idea:
  – If an enhancer regulates a gene, the activity of the enhancer should be correlated with the expression of the gene
• Using this idea:
  1. Compute correlation for all enhancer-target pairs within a certain maximum distance
  2. Return the ones with significant correlations
• Issues:
  – Quantification of enhancer activity
  – Multiple hypothesis testing
  – An enhancer may only regulate a gene in some contexts
  – A gene may have more than one regulating enhancer
  – Cannot identify context-specific regulation
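The two steps above can be sketched as follows, using the Pearson correlation between enhancer activity and gene expression measured across the same list of contexts. As a loud simplification, a fixed cutoff `min_r` stands in for a proper significance test with multiple-testing correction; the data layout and all names are assumptions for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlated_pairs(enhancers, genes, max_distance, min_r):
    """Return (enhancer, gene, r) for pairs within max_distance with r >= min_r.

    enhancers: {id: (position, [activity per context])}
    genes:     {id: (tss_position, [expression per context])}
    """
    pairs = []
    for e_id, (e_pos, e_act) in enhancers.items():
        for g_id, (g_pos, g_expr) in genes.items():
            if abs(e_pos - g_pos) > max_distance:
                continue  # step 1: only consider pairs within the distance limit
            r = pearson(e_act, g_expr)
            if r >= min_r:  # step 2: stand-in for a significance test
                pairs.append((e_id, g_id, r))
    return pairs
```

In practice a p-value per pair, followed by a correction such as Benjamini-Hochberg, would replace the fixed cutoff, which is where the multiple hypothesis testing issue listed above comes in.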
Co-conservation/co-evolution
• Main idea: If an enhancer regulates a gene, they will
  – Co-occur in genomes
  – Mutate together
Identifying enhancer targets
Image credit: He et al., PNAS 111(21):E2191-E2199, (2014)
EXPERIMENTAL VALIDATIONS
Part 4
Validating enhancers
• Reporter assay
  – Put a construct with an enhancer candidate, a reporter gene, and a weak promoter
  – If the enhancer is active, the reporter gene will be transcribed
• Limitations: Not the natural context
  – Distance
  – Chromatin state
  – Presence of relevant TFs
Reporter assay
ID             Enhancer activity   Tissues with patterns
ENH_DISCR_2    Positive            Tectum, Fin
ENH_DISCR_38   Negative
ENH_DISCR_16   Negative            Not consistent
ENH_DISCR_18   Positive            Telencephalon
ENH_DISCR_37   Positive            Epidermis
ENH_DISCR_14   Negative            Not consistent
ENH_DISCR_19   Negative            Not consistent
ENH_DISCR_34   Positive            Epidermis
ENH_DISCR_41   Positive            Blood_heart
ENH_DISCR_44   Weak                Blood
ENH_DISCR_24   Negative            Not consistent
ENH_DISCR_1    Negative            Not consistent/heart
ENH_DISCR_17   Positive            Telencephalon
ENH_DISCR_32   Positive            Telencephalon
ENH_DISCR_47   Positive            Epidermis
ENH_DISCR_35   Positive            Blood, ear
ENH_DISCR_45   Weak                Epidermis
ENH_DISCR_21   Positive            Epidermis, late
ENH_DISCR_12   Negative
ENH_DISCR_13   Positive            Telencephalon
ENH_DISCR_22   Weak                Epidermis_blood
ENH_DISCR_26   Negative            Not consistent
ENH_DISCR_31   Weak                Blood
ENH_DISCR_40   Positive            Blood
ENH_DISCR_48   Positive            Epidermis, blood
ENH_DISCR_25   Positive            Tectum
ENH_DISCR_29   Positive            Telencephalon
Image credit: The ENCODE Project Consortium, Nature 489(7414):57-74, (2012)
Massively parallel reporter assay
• Investigating the impact of mutations to enhancers
Image credit: Melnikov et al., Nature Biotechnology 30(3):271-277, (2012)
STARR-seq
• The candidate enhancer is placed downstream of the reporter gene, so that it becomes part of the transcript and its activity can be determined by RNA-seq
Image credit: Arnold et al., Science 339(6123):1074-1077, (2013)
Enhancer knock-out
• Current technology (e.g., CRISPR) allows for precise deletion of an enhancer candidate. The effect on gene expression can then be determined
Image credit: Hsu et al., Cell 157(6):1262-1278, (2014)
DNA long-range interactions
• Hi-C/TCC: Not specific to transcription regulation
• ChIA-PET: Requires a relevant protein
Image credit: Zeng and Mortazavi, Nature Immunology 13(9):802-807, (2012)
Summary
• Enhancers are one important type of transcriptional regulatory element
• There are many imperfect signatures of enhancers
• Activities of some enhancers are context-specific
• The target gene of an enhancer can be far away from it, but in general not too far
• (Qin has written a review on enhancer and enhancer target predictions: http://www.cse.cuhk.edu.hk/~kevinyip/papers/EnhancerReview_CBBGR2015.pdf)