Characterising the Interactome of EZH2 in ES Cells Daire Murphy 09329510 Gerard Cagney BIOC40090 22/02/2013
Summary
The aim of this study was to characterise and develop the Ezh2 interactome using Ezh2
immunoprecipitation samples from embryonic stem (ES) cells. Ezh2 is the core catalytic
subunit of the PRC2 complex, which is associated with the maintenance of pluripotency and
self-renewal properties in stem cells through epigenetic modifications. Proper functioning of
the PRC2 complex is required for healthy stem cell differentiation and dysfunction of this
complex, whether through mutations in Ezh2 or its other subunits, has been implicated in
tumourigenesis. A mass spectrometric approach was used to identify and quantify the
complex protein mixture obtained from immunoprecipitation experiments in order to establish
an experimental interactome. In-depth analysis of the mass spectrometry results was
performed to compile a list of potential high-confidence interactors, and several
bioinformatics tools provided insight into the complexity of the Ezh2 interactome and
potential new research avenues.
The experimental interactome appeared to be enriched in chromatin remodelling proteins and
transcriptional regulators, as expected given the nature of the PRC2 complex.
An enrichment in splicing factors was also observed, particularly those belonging to the
spliceosome, which prompted further analysis. An association between the spliceosome and
the PRC2 complex has yet to be characterised in the literature; therefore an
experimental approach for characterising this interaction was developed to provide a platform
for further research into the topic. Advances in this field could provide insight
into how differentiation of stem cells is controlled while maintaining stemness, and how
aberrations in this process can cause the development of tumour cells.
Table of Contents
Summary
1. Introduction
1.1 Stem Cell Characteristics
1.2 Chromatin Structure and the Epigenetic Code
1.3 EZH2 and the Polycomb Group Proteins
1.4 Mass Spectrometry
1.5 Aim
2. Materials and Methods
2.1 Immunoprecipitation
2.2 Gel Electrophoresis
2.3 In-gel Trypsin Digestion
2.4 Peptide Extraction
2.5 Mass Spectrometry
2.6 Protein Identification
3. Results
4. Discussion
4.1 The Current Ezh2 Interactome
4.2 Large Scale Proteomic Analysis of Protein-Protein Interactions
4.3 Global Analysis of the Ezh2 Interactome Data
4.4 Splicing and the PRC2 Complex
4.5 Conclusions
5. Acknowledgements
6. Bibliography
7. Appendices
1. Introduction
1.1 Stem Cell Characteristics
Embryonic stem (ES) cells are derived from the inner cell mass of mammalian
blastocysts, while adult stem cells, which are important for tissue homeostasis and damage
repair, can be found in most tissues throughout the body. Stem cells are characterised by
their ability to replicate while still maintaining an undifferentiated state. Each ES cell has
the potential to differentiate into cells of any of the three germ layers (endoderm, ectoderm,
or mesoderm), a property known as pluripotency. Maintaining pluripotency is a key process in
stem cells and is regulated both in the traditional genetic sense and in an epigenetic manner.
Four key genes have been identified that are necessary for the maintenance of pluripotency in
stem cells: Oct4, Sox2, Klf4, and c-Myc [1]. It has also been observed that regular somatic
cells can be transformed back into a pluripotent state through activation of these genes;
these cells are known as induced pluripotent stem cells (iPSCs) and have several promising
applications within the field of regenerative medicine [1].
Figure 1.1: Pluripotent stem cell life cycle; each stem cell has the potential to differentiate into a multitude of cell
types. (Image taken from: http://www.stemcellresearchfoundation.org/WhatsNew/Pluripotent.htm)
Pluripotency in stem cells is also regulated by epigenetic factors, including proteins belonging
to the Polycomb group (PcG), which act as transcriptional repressors. It has been observed that
the PcG protein complexes PRC1 and PRC2 co-localise on many genes essential for
differentiation in both murine and human stem cells [2]. Many of the developmental PcG
target genes have also been shown to be bound by the key pluripotency transcription factors
Oct4, Sox2, and Nanog, which suggests collaboration between the PcG proteins and the
essential pluripotency gene products identified by Yamanaka et al. [1][2].
Characterisation of the interactions between the genetic and epigenetic elements of stem cell
pluripotency maintenance holds the potential for answering key questions about cell
maturation and the downstream effects of dysfunctional regulation of pluripotency, which has
been implicated in several disease states, primarily cancer [4][5][6][7].
1.2 Chromatin Structure and the Epigenetic Code
In order to fully understand the mechanisms by which the PcG proteins affect transcription
and their interactions with key pluripotency regulators it is important to first understand how
chromatin is ordered and how the PcG proteins are known to interact and alter this structure.
Chromatin is a combination of proteins and DNA which interact
to compact DNA into a stable and manageable structure. The
main protein components are called histones which combine to
form octameric units which the DNA wraps around to form a
nucleosome, the primary subunit of chromatin.
Figure 1.2: The multiple levels of
DNA compaction and Chromatin
structure.
This highly compacted structure allows huge amounts of DNA to be stored in each cell while
also protecting the DNA from damage and strengthening it for mitosis. The level of
compaction is also an important factor in regulating gene expression as compacted DNA
cannot be accessed by transcription machinery.
Figure 1.3: Nucleosome Structure. Each nucleosome consists of two H3 histones, two H4 histones, two H2A and two
H2B histones to form an octameric unit. Each histone has a tail structure which extends outwards from the core
nucleosome unit. (Taken from: ‘Epigenetics’, Allis et al. (2007))
Epigenetic modification of nucleosomes is extremely important for transcriptional regulation.
The amino termini of the histones, or histone tails, are the primary target for modification;
they are located outside of the main histone core and are thought to be largely unstructured
due to their lack of visibility in crystal structures [7]. Evidence suggests that the tails are
largely responsible for the compaction of nucleosomes into chromatin [7]; therefore
modification of these tails can alter chromatin structure. The tails and adjacent regions are
also known to be target areas for transcription machinery [7]. These tails are rich in lysine
and arginine residues, which can be modified in multiple ways including methylation,
acetylation, and ubiquitination. Chromatin can exist in two states: heterochromatin, which is
condensed and transcriptionally inactive, and euchromatin, which is open and
transcriptionally active.
Both states are associated with specific histone-tail modifications, or histone marks; high
levels of methylation are usually associated with transcriptionally silent genes while
acetylation and phosphorylation are associated with active genes. However, there are
exceptions, and the outcome appears to depend on the combination of the type of mark and the
position at which it is placed, rather than on the type of mark alone.
Figure 1.4: Different epigenetic marks associated with euchromatin and heterochromatin. Methylation marks are
associated with condensed, inactive chromatin while acetylation marks are associated with open, active chromatin.
(Taken from: ‘Epigenetics’, Allis et al. (2007))
The histone code hypothesis states that distinct histone modifications, on one or more tails,
act sequentially or in combination to form a code that is read by other proteins to bring about
distinct downstream events [8]. One of the primary downstream events targeted by histone
modifications is transcription, which can be either repressed or activated by these marks. Both
repression and activation of transcription are associated with known histone marks; for
example, the H3K27me3 (histone H3 lysine 27 trimethylation) mark is known to silence
transcription, while the H3K4me3 (histone H3 lysine 4 trimethylation) mark is associated
with activation of transcription. The Polycomb group and the Trithorax group (TrxG) proteins
are well-known transcriptional regulators which are responsible for transcriptional silencing
and activation respectively. Both groups of proteins use histone modifications to alter the
structure of the chromatin and its interacting proteins and have well established links with ES
cell differentiation and pluripotency.
Interestingly, both of these marks, H3K27me3 and H3K4me3, can be present on the same gene,
which gives rise to the concept of bivalent domains. Bivalent domains contain both activating
and repressive marks, which may seem contradictory but are thought to maintain specific
genes in a poised state [9], initially repressing gene expression through the dominant
H3K27me3 mark while also maintaining the ability to quickly activate transcription through
the removal of the H3K27me3 mark, leaving just the activating H3K4me3 mark.
The histone code also plays an important role in cellular memory outside of canonical DNA
heritability, as histone modifications have been seen to be passed on through cell lines
[10][11]. These modifications are thought to be important in cell identity as they potentially
provide each cell with a genome specific to its function.
1.3 Ezh2 and the Polycomb Group Proteins
The polycomb group proteins (PcG) are a family of proteins associated with silencing of gene
expression through chromatin remodelling. PRC1 and PRC2 are multiprotein complexes
belonging to the PcG family which work together to repress target genes. Repression of target
genes by the PRC complexes is a multistep process starting with the PRC2 complex. The
PRC2 complex trimethylates histone H3 lysine 27 (H3K27) through its catalytic EZH2
subunit, which contains a methyltransferase SET domain.
The trimethylation of H3K27 is associated with transcriptional silencing and is known to
recruit the PRC1 complex, which then ubiquitinates histone H2A lysine 119 (H2AK119). This
modification is thought to stabilise chromatin condensation and the subsequent
transcriptional repression.
Figure 1.5: Interaction of PRC1 and PRC2 to ultimately silence transcription. PRC2 binds the Polycomb response element
(PRE) and trimethylates H3K27, which acts as a binding platform for PRC1, which in turn causes chromatin condensation
and transcriptional silencing. (Modified from: ‘Epigenetics’, Allis et al. (2007))
The PRC2 complex itself consists of four core components: Enhancer of Zeste homologue 2
(EZH2), Suppressor of Zeste 12 (SUZ12), Embryonic ectoderm development (EED), and
RbAp46/48. It has been observed through multiple knock-out experiments that removal of
any of the EZH2, SUZ12, or EED subunits inactivates the PRC2 complex and disrupts ES cell
differentiation and embryonic development [12][13][14]. As previously mentioned, EZH2 is the
core catalytic subunit, generating the H3K27me3 mark via its SET methyltransferase domain.
The SUZ12 and EED subunits contain a zinc-finger DNA/RNA binding domain and a
methyl-lysine binding domain respectively, while RbAp46/48 are histone binding proteins, all
of which are thought to aid in the binding of PRC2 to target genes.
Several accessory proteins such as JARID2, Aebp2, and the Polycomb-like (PCL) proteins are
known to interact with the complex, and while they do not constitute part of the core complex
they have been shown to be necessary for proper functioning. In particular, the PCL protein
PHF1 has been shown to interact with EZH2 and is necessary for trimethylation activity. The
absence of this protein results in mono/di-methylation of H3K27 and therefore inadequate
gene suppression [15]. JARID2 has been shown to be necessary for recruitment of PRC2 to
target genes, and once again absence of this protein results in inadequate gene suppression
and a loss of the H3K27me3 mark on target genes [16].
Recent studies have shown that non-coding RNAs (ncRNAs) interact with the PRC2 complex
through the Suz12 subunit and may play a role in recruitment [17]. Interestingly, many of these
ncRNAs are transcribed from PcG-repressed genes [17], which may explain why certain genes
contain bivalent domains, while also suggesting how PRC2 is directed to its target genes.
Figure 1.6: The PRC2 complex and its associated chromatin interactions. EZH2, EED, SUZ12 and RbAp46/48 are
essential for PRC2 complex formation, while additional protein units bind to enhance its functioning. Putative
interactions with either DNA or histones that could explain PRC2 recruitment are highlighted [18]. Recent
developments have associated the binding of ncRNA to the Suz12 subunit, as opposed to the Ezh2 subunit as shown [17].
The vast array of known interactions between the PRC2 complex and accessory molecules
provides an insight into the complexity of the Ezh2 (PRC2) interactome, while also
establishing some of the key questions that need answering. Hence, in this project, I aim to
characterise the interactions of Ezh2 in ES cells.
1.4 Mass Spectrometry
Mass spectrometry (MS) has become an increasingly important tool in proteomics research in
recent times. Not only can it be used to determine the sequence of a protein or peptide but it
can also be used to detect the protein composition of a mixture using a technique known as
peptide mass fingerprinting (PMF).
A typical mass spectrometry experiment involves 3 main steps:
1. Ionisation; the sample molecules are converted into charged gas-phase ions. The two
primary techniques used to ionise the sample are electrospray ionisation (ESI) and
matrix-assisted laser desorption ionisation (MALDI).
2. Separation; the ions are then sorted based on their mass-to-charge ratio
(m/z). There are multiple techniques for this separation stage including time-of-flight
(TOF), quadrupole mass filter, or ion traps.
3. Detection; the mass-to-charge ratio of the ions is detected and quantified to give a final
output in the form of a spectrum.
A variety of combinations of ionisation and separation methods can be used, depending on
what type of sample is being analysed. A common method for protein detection is MALDI-
TOF due to its high throughput properties.
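As a minimal sketch of the quantity used in the separation step, the m/z value of a positively charged ion can be computed from its neutral mass and charge state. The function name and the example peptide mass below are illustrative assumptions, not values from this project.

```python
# Mass of a proton in daltons (standard value).
PROTON_MASS = 1.007276


def mass_to_charge(neutral_mass: float, charge: int) -> float:
    """Return the m/z of a positive ion carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide of neutral mass 1500 Da observed at charge states 1+ to 3+.
peptide_mass_da = 1500.0
for z in (1, 2, 3):
    print(f"{z}+ ion: m/z = {mass_to_charge(peptide_mass_da, z):.4f}")
```

The same peptide therefore appears at a different position in the spectrum for each charge state, which is why the charge must be known before a mass can be assigned.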
Peptide mass fingerprinting (PMF) uses computer algorithms to analyse the MS spectrum and
match the detected fragment ions to a database of known or predicted peptide fragments to
predict the protein they belong to. Peptide mass fingerprinting has some limitations when
used with standard MS protocols, primarily with protein mixtures. PMF algorithms generally
assume all peptide fragments belong to the same protein, and expanding algorithms to match
fragments with more than one protein is quite difficult.
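The matching step described above can be illustrated with a short sketch. It is a simplified illustration only, not the algorithm used by any particular search engine: each database protein is digested in silico with the usual trypsin rule (cleavage after K or R, but not before P), peptide masses are computed from standard monoisotopic residue masses, and the protein with the most observed masses matched is reported. The two protein sequences, the function names, and the 0.01 Da tolerance are hypothetical.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(sequence, observed, tol=0.01):
    """Number of observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed, database, tol=0.01):
    """Name of the database protein that explains the most observed masses."""
    return max(database, key=lambda name: count_matches(database[name], observed, tol))


# Hypothetical two-protein database and a simulated spectrum of protA peptides.
database = {"protA": "MKTAYIAKQR", "protB": "GSHMLEDPVKWR"}
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
print(best_match(observed, database))
```

Because the score is computed per protein, a spectrum containing peptides from several proteins dilutes every individual score, which is the limitation with mixtures noted above.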
Typically this can be overcome by initial separation of protein samples using gel
electrophoresis combined with a method known as tandem mass spectrometry (MS/MS)
which involves two separation methods with an additional fragmentation cell between them.
Figure 1.7: Tandem mass spectrometry diagram. A mixture of peptide fragments is initially separated by the first
separation unit and filtered into a collision cell. The collision cell uses inert gas molecules to fragment the peptides
further through collisions; the new fragments are then filtered into the second separation chamber and sorted by their
m/z ratio and passed onto the detector.
The use of tandem MS and PMF is extremely useful in the field of proteomics, especially in
relation to this project, as it allows quantification and analysis of large protein mixtures to
establish a database of potentially interacting proteins.
1.5 Aim
The aim of this study is to expand our knowledge of the current Ezh2 (PRC2) interactome,
with the hopes of identifying promising new areas of potential research that could deepen our
knowledge of the polycomb proteins and their effects within the cellular environment.
2. Materials and Methods
2.1 Immunoprecipitation
Immunoprecipitations were performed on nuclear protein lysates prepared in low-salt buffer
containing protease inhibitors (150 mM NaCl, 50 mM Tris-HCl, pH 8.0, 1 mM EDTA, 1%
(v/v) NP-40, 1 μg ml⁻¹ aprotinin, 10 μg ml⁻¹ leupeptin and 1 mM PMSF) [14].
Immunoprecipitations of Flag-tagged proteins were performed using M2 anti-Flag agarose
(Sigma) overnight at 4 °C. Elution of Flag-tagged proteins was performed at 4 °C using
250 μg ml⁻¹ of 3× Flag peptide (Sigma) in 0.05% (v/v) NP-40 with horizontal shaking. Eluted
protein fractions were separated by SDS-PAGE and analysed by western blotting or liquid
chromatography MS [14]. Immunoprecipitation samples were provided by the Bracken lab,
located in the Smurfit Institute of Genetics, Trinity College Dublin.
2.2 Gel electrophoresis
Immunoprecipitation samples were prepared for SDS-PAGE; each sample was made up with
equal parts protein solution and 2x sample buffer to give a final 1x sample buffer
concentration. Three samples of varying protein content (10 µg, 20 µg, and 30 µg) were
prepared according to table 2.1.
Once prepared, the samples were boiled for 10 minutes at 96 °C to fully denature the proteins.

Stock protein concentration: 5.25 µg/µl

Desired protein content (µg)   Stock volume (µl)   2x sample buffer volume (µl)   Final volume (µl)
10                             1.9                 1.9                            3.8
20                             3.8                 3.8                            7.6
30                             5.7                 5.7                            11.4

Table 2.1: SDS-PAGE sample preparation
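The volumes for sample preparation follow directly from the stock concentration and the 1:1 ratio of protein solution to 2x sample buffer. A minimal sketch of the calculation is given below; the function name is an illustrative assumption, and volumes are rounded to 0.1 µl as a practical pipetting limit.

```python
def sample_volumes(protein_ug: float, stock_ug_per_ul: float = 5.25):
    """Return (stock, 2x buffer, final) volumes in µl for one SDS-PAGE sample.

    The stock volume is the desired protein mass divided by the stock
    concentration; an equal volume of 2x sample buffer brings it to 1x.
    """
    stock_ul = round(protein_ug / stock_ug_per_ul, 1)
    buffer_ul = stock_ul  # equal parts protein solution and 2x buffer
    return stock_ul, buffer_ul, round(stock_ul + buffer_ul, 1)


for protein_ug in (10, 20, 30):
    stock_ul, buffer_ul, final_ul = sample_volumes(protein_ug)
    print(f"{protein_ug} µg: {stock_ul} µl stock + {buffer_ul} µl buffer = {final_ul} µl")
```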
SDS-PAGE was set up as follows. A Tris-glycine SDS-PAGE running buffer was first prepared
as a 10X stock and subsequently diluted 10-fold using ultrapure water to achieve the
desired 1X concentration (table 2.1).
10x Running Buffer (500ml)
Component        Weight
Tris (Sigma)     15g
Glycine          72g
SDS              5g
Bring up to 500ml with ultrapure water.

1x Running Buffer (500ml)
Component                   Volume
10x Running Buffer stock    50ml
Bring up to 500ml with ultrapure water.

Table 2.1: SDS-PAGE-Tris running buffer composition
A 7.5% resolving gel (table 2.3) was poured and left to set for 20-30 minutes. Once set, a 4.5%
stacking gel (table 2.3) was poured on top, a comb was placed in the gel immediately, and the
gel was left to set for 15-20 minutes. The SDS-PAGE apparatus was cleaned and assembled
using the prepared polyacrylamide gel, and the 1X running buffer was added.
4x Resolving Buffer (50ml)
Component    Final Conc.    Quantity        pH
Tris         1.5M           9.0855g/50ml    8.8
10% SDS      0.4%           2ml
Bring up to 50ml with ultrapure water.

4x Stacking Buffer (50ml)
Component    Final Conc.    Quantity        pH
Tris         0.5M           3.03g/50ml      6.8
10% SDS      0.4%           2ml
Bring up to 50ml with ultrapure water.

Table 2.2: Stock buffer recipes for PAGE gels
Resolving Gel (7.5%)               Stacking Gel (4.5%)
Component              Volume      Component             Volume
4x Resolving Buffer    3.5ml       4x Stacking Buffer    3.75ml
40% Acrylamide         2.6ml       40% Acrylamide        3ml
Water                  7.9ml       Water                 8.25ml
10% APS                82µl        10% APS               110µl
TEMED                  7.8µl       TEMED                 15µl

Table 2.3: Polyacrylamide gel recipes
The boiled immunoprecipitation samples were spun down using a microcentrifuge to remove
evaporation droplets and bubbles from the sample which could potentially impede loading of
the samples. The samples were loaded onto the gel according to table 2.4; a small amount of
sample buffer was loaded between protein samples to prevent cross-contamination and to aid
even running of the samples during electrophoresis.
Lane       1     2     3     4       5     6       7     8        9     10
Content    SB    Mw    SB    10µg    SB    20µg    SB    30µg     SB    SB
Volume     5µl   5µl   5µl   3.8µl   5µl   7.6µl   5µl   11.4µl   5µl   5µl

Table 2.4: SDS-PAGE gel lane contents. SB: Sample buffer (Invitrogen), Mw: Molecular weight markers (Thermo
Scientific PageRuler Prestained Protein Ladder)
The gel was run at 100V for 1 hour, or until the dye front was approximately ¾ of the way
down the gel. Once finished, the gel was stained with Coomassie Blue (Fisher Scientific) for 20
minutes. Excess dye was then washed off with dH2O and the gel was left to de-stain in dH2O
overnight on a shaker.
2.3 In-gel trypsin digestion
Buffer concentration    200mM    100mM    50mM    20mM
H2O                     50ml     50ml     50ml    50ml
NH4HCO3                 0.8g     0.4g     0.2g    0.08g

Table 2.5: NH4HCO3 buffer solution concentrations
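The masses in table 2.5 can be checked with a short calculation: mass = concentration × volume × molar mass. The sketch below assumes a molar mass of 79.06 g/mol for ammonium bicarbonate; the function name is illustrative.

```python
# Molar mass of ammonium bicarbonate (NH4HCO3) in g/mol.
NH4HCO3_MOLAR_MASS = 79.06


def grams_required(conc_mm: float, volume_ml: float,
                   molar_mass: float = NH4HCO3_MOLAR_MASS) -> float:
    """Mass of solute (g) needed for `volume_ml` of a `conc_mm` millimolar solution."""
    moles = (conc_mm / 1000.0) * (volume_ml / 1000.0)  # mol/l × l
    return moles * molar_mass


for conc_mm in (200, 100, 50, 20):
    print(f"{conc_mm} mM in 50 ml: {grams_required(conc_mm, 50):.2f} g")
```

The weights listed in the table are these values rounded to a practical weighing precision.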
The workspace, tube racks, and plastic sheeting were cleaned thoroughly with freshly
prepared 70% ethanol; this was done regularly to avoid contamination. Lane 6 (20µg) was
excised from the gel with a sterile scalpel and cut into 8 sections. Eight 0.5ml Eppendorf tubes
were washed with acetonitrile (ACN), one for each section of the lane. All buffers were made
fresh according to table 2.5.
70µl of 200mM NH4HCO3 (ammonium bicarbonate) was added to each of the eight
Eppendorf tubes (labelled 1-8, with 1 being the topmost section of the lane). Each section was
cut into ~1mm² pieces and placed into the prepared Eppendorf tubes, as shown in figure 2.1.
Figure 2.1: Lane cutting workflow. The desired lane was excised from the gel, and then cut into eight sections. Each
section was cut into ~1mm² pieces and placed in a clean 0.5ml Eppendorf tube with 70µl of 200mM NH4HCO3.
The next step in the digestion process then consisted of several rounds of shrinking and
rehydrating of the gel to remove the protein from the gel pieces for trypsin digestion. Each
tube was treated in the same manner according to figures 2.2 to 2.9.
Figure 2.2: Washing of gel pieces to remove dye (tube images taken from http://www.clker.com; public clipart).
Figure 2.3: Shrinking (dehydration) of gel pieces. Shrinking the gel with NH4HCO3 disrupts the interactions between
the protein and gel; continuous rounds of dehydration and rehydration at various concentrations of ammonium
bicarbonate release the proteins from the gel.
Figure 2.4: Rehydration and shrinking of the gel pieces.
Figure 2.5: Rehydration and shrinking of gel spots.
Figure 2.6: Reduction and protection of cysteine residues. DTT is a strong reducing agent which will break the
disulphide bonds in the proteins contained in the gel. Iodoacetamide binds the free thiol groups exposed by the DTT
reduction to prevent reformation of the disulphide bonds.
Figure 2.7: Washing and shrinking of gel pieces.
Figure 2.8: Final shrinking of gel pieces
Figure 2.9: Trypsin digestion of proteins for peptide extraction. The trypsin was prepared using a Trypsin Stock kit
(Sigma); the dried trypsin (0.4µg) was solubilised using 5µl of the trypsin solubilisation reagent and vortexed to
ensure the powder was fully dissolved. 45µl of the trypsin reaction buffer was then added, making the final trypsin
concentration 20µg/ml.
2.4 Peptide Extraction
Component Buffer 1 Buffer 2
Acetonitrile 3.5ml (70%) 0.5ml (10%)
Formic Acid 0.4ml (4%) 0.05ml (0.1%)
dH2O 1.1ml 4.45ml
Table 2.6: Peptide extraction buffers
Samples were spun down using a desktop microcentrifuge for 10 seconds and the supernatant
was removed and added into new 0.5ml Eppendorf tubes. 30µl of buffer 1 was added to the
remaining gel pieces and left for 10 minutes to retrieve any remaining peptides from the gel
pieces. The samples were spun down and the supernatant was removed and added to the new
Eppendorf tubes. The peptide samples were then dried in a vacuum centrifuge at 60 °C for 1.5
hours. The dried peptides were then resuspended in 10µl of buffer 2 and transferred into
labelled MS tubes.
2.5 Mass spectrometry
An EASYnLC (Proxeon Biosystems), with an LTQ Orbitrap Classic mass spectrometer
(Thermo Scientific) and nano-electrospray ionisation source (Proxeon Biosystems) was used
to perform liquid chromatography-tandem mass spectrometry (LC-MS/MS). Mass
spectrometry analysis was carried out by Kieran Wynne in the Conway Institute Core Facility.
Peptides were separated on a 15cm reversed phase analytical column (75µm internal
diameter) in-house packed with 3 µm ReproSil-Pur C18AQ beads, using a 1hr gradient from
2-25% buffer B (99.9% MeCN/0.1% formic acid) at a flow rate of 0.4 µl/min.
Data were continuously collected in a data-dependent manner, with the Orbitrap collecting a
survey scan at 60,000 resolution with an automatic gain control (AGC) target of 1 × 10^6.
Collision-induced dissociation (CID) MS/MS scans followed, using the 10 most abundant ions
from the survey scan, with an AGC target of 5,000; signal threshold of 1,000; 2.0 Da isolation
width; and MS activation time at 35% normalised collision energy. Charge state screening
was used to reject unassigned or 1+ charge states. Dynamic exclusion was enabled to ignore
masses for 30 s that had previously been selected for fragmentation.
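The precursor selection logic described above (top 10 ions, signal threshold, charge state screening, and 30 s dynamic exclusion) can be sketched as follows. This is an illustrative model only and not the instrument's control software; the `Ion` class, the function name, and the 0.01 m/z matching tolerance used for exclusion are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Ion:
    mz: float
    intensity: float
    charge: Optional[int]  # None means the charge state could not be assigned


def select_precursors(ions: List[Ion], now_s: float, excluded: Dict[float, float],
                      top_n: int = 10, threshold: float = 1000.0,
                      exclusion_s: float = 30.0, mz_tol: float = 0.01) -> List[Ion]:
    """Pick up to `top_n` precursors from one survey scan.

    `excluded` maps previously selected m/z values to their selection time
    and is updated in place (dynamic exclusion list).
    """
    # Forget entries whose exclusion window has expired.
    for mz in [m for m, t in excluded.items() if now_s - t >= exclusion_s]:
        del excluded[mz]

    candidates = [
        ion for ion in ions
        if ion.charge is not None and ion.charge >= 2      # charge state screening
        and ion.intensity >= threshold                      # signal threshold
        and not any(abs(ion.mz - m) <= mz_tol for m in excluded)
    ]
    chosen = sorted(candidates, key=lambda i: i.intensity, reverse=True)[:top_n]
    for ion in chosen:
        excluded[ion.mz] = now_s
    return chosen


# Hypothetical survey scan: one 1+ ion, one weak ion, one unassigned ion.
scan = [Ion(500.0, 5000, 2), Ion(600.0, 9000, 1), Ion(700.0, 800, 2),
        Ion(800.0, 3000, None), Ion(900.0, 2000, 3)]
exclusion_list: Dict[float, float] = {}
print([ion.mz for ion in select_precursors(scan, 0.0, exclusion_list)])
```

Dynamic exclusion stops the most abundant peptides from being fragmented repeatedly, so lower-abundance ions get sampled in subsequent cycles.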
2.6 Protein identification and quantification
The raw MS file was processed using MaxQuant on default settings (figure 2.10), with the
exception of carbamidomethyl as a fixed modification.
Figure 2.10: Default MaxQuant settings
3. Results
3.1 A proteomics approach to characterising the Ezh2 interactome
Figure 3.1 shows a schematic workflow of the approach taken in this project to provide a
clear and concise overview of the methods used and the reasoning behind each step. The
diagram shows how the initial Ezh2 immunoprecipitation was processed, through a
combination of lab based methods and bioinformatics, to provide the final output seen in table
3.1. As with all scientific experiments each step in the process had a specific aim:
1. Immunoprecipitation: Purification of the protein of interest (Ezh2), and interacting
proteins, from cell lysate.
2. SDS-PAGE: Separation of proteins by molecular weight.
3. In-gel Trypsin digestion: Removal of proteins from the gel and subsequent digestion
of proteins into small peptides for MS analysis.
4. Mass spectrometry: Detection and quantification of peptide ions.
5. Peptide mass fingerprinting (MaxQuant analysis): Matching of peptide ions to known
and predicted MS/MS ion databases to generate a protein list.
6. Statistical Analysis (Perseus): Assignment of p-values to each protein hit to allow
separation of false positives from true interactors.
7. Bioinformatic analysis: Interpretation of data using various databases to generate
results and evidence for further research.
Figure 3.1: Workflow for analysis of the Ezh2 interactome:
1. A co-immunoprecipitation of Ezh2 from ES cell lysate was performed.
2. The protein samples were separated on an SDS-PAGE gel.
3. An in-gel trypsin digestion and peptide extraction was performed.
4. The peptide samples were analysed using LC-MS/MS.
The MS results were analysed using MaxQuant and peptide mass fingerprinting to obtain a list of proteins which
were subsequently used to form an interactome for Ezh2 using bioinformatics database analysis.
(SDS-PAGE apparatus image edited from; http://commons.wikimedia.org/wiki/File:SDS-PAGE_Electrophoresis.png)
(ES cell images taken from; http://en.wikipedia.org/wiki/File:Humanstemcell.JPG)
3.2 Statistical Analysis and Data Interpretation
The aim of the statistical analysis was to separate the true positive Ezh2 interactors from the
false positives through use of a Student’s t-test.
The raw mass spectral files were initially analysed using MaxQuant, a quantitative proteomics
software package designed for analysing large mass-spectrometric data sets [19][20]. By using
Perseus, a statistical software package designed to perform downstream bioinformatics and
statistics on the MaxQuant output tables [20], it was possible to perform a Student’s t-test to
generate p-values for each protein hit. Taking a confidence level of 95% to be adequate,
proteins with a p-value of less than 0.05 were taken to be true interactors. Table 3.1 shows all
protein hits with P<0.05, to allow for easier interpretation of the total data set (found in
appendix 1).
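A 95% confidence level corresponds to a significance threshold of α = 0.05. The filtering step can be sketched as below; this is not the Perseus workflow itself, only an illustration of the cut-off applied to its output. The first three rows are taken from table 3.1, while `EXAMPLE_MOUSE` is a hypothetical non-significant hit added for contrast.

```python
# Each hit: (protein, Ezh2 spectral count, control spectral count, p-value).
hits = [
    ("TITIN_MOUSE", 382, 0, 1.22e-41),
    ("SUZ12_MOUSE", 76, 0, 1.12e-08),
    ("GPC4_MOUSE", 30, 2, 0.031020003),
    ("EXAMPLE_MOUSE", 12, 9, 0.41),  # hypothetical entry, not from the data set
]


def significant_hits(rows, alpha=0.05):
    """Keep only the rows whose p-value is below the significance threshold."""
    return [row for row in rows if row[3] < alpha]


for name, ezh2_count, control_count, p_value in significant_hits(hits):
    print(f"{name}: {ezh2_count} vs {control_count} spectra, p = {p_value:.3g}")
```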
Gene  Ezh2 spectral count  Control spectral count  P-value  Description
TITIN_MOUSE 382 0 1.22E-41 Plays a role in chromatin condensation and segregation
GUF1_MOUSE 231 0 1.20E-25 Mitochondrial translation factor
PRP8_MOUSE 183 0 2.18E-20 Spliceosome component
SF3B1_MOUSE 150 0 9.82E-17 Splicing factor
SF3B3_MOUSE 142 0 5.90E-16 Splicing factor
DDX21_MOUSE 127 0 2.39E-14 RNA helicase
TOP2A_MOUSE 127 0 2.39E-14 DNA topoisomerase
SPT5H_MOUSE 126 0 4.16E-14 Transcription factor
SF3A2_MOUSE 106 0 3.88E-12 Splicing factor
SSRP1_MOUSE 101 0 1.56E-11 Component of the FACT complex
U5S1_MOUSE 98 0 4.20E-11 Component of the U5 snRNP complex – pre-mRNA splicing
SMCA4_MOUSE 84 0 1.01E-09 (BRG1) Transcriptional co-activator
H2B1B_MOUSE 81 0 2.62E-09 Histone
H2B1C_MOUSE 81 0 2.62E-09 Histone
H2B1K_MOUSE 81 0 2.62E-09 Histone
DHX15_MOUSE 78 0 4.10E-09 Pre-mRNA splicing factor
K2C6B_MOUSE 78 0 4.10E-09 Keratin
H2A1K_MOUSE 75 0 1.02E-08 Histone
SUZ12_MOUSE 76 0 1.12E-08 Component of PRC2 complex
SMCA5_MOUSE 73 0 1.62E-08 Helicase – nucleosome remodelling activity
SPT6H_MOUSE 73 0 1.62E-08 Elongation factor
EZH2_MOUSE 74 0 1.75E-08 Catalytic component of the PRC2 complex
ARI1A_MOUSE 72 0 2.72E-08 Chromatin Remodelling (nBAF)
SMG1_MOUSE 67 0 6.59E-08 Protein Kinase
H2AV_MOUSE 60 0 4.17E-07 Histone
LRP2_MOUSE 60 0 4.17E-07 LDL receptor related protein
ASH1L_MOUSE 61 0 4.44E-07 H3K36 methyltransferase
ASPM_MOUSE 61 0 4.44E-07 Mitotic spindle regulation
VP13C_MOUSE 61 0 4.44E-07 Vacuolar protein sorting-associated protein 13C
EED_MOUSE 59 0 6.94E-07 Component of the PRC2 complex
CDC5L_MOUSE 56 0 1.06E-06 DNA binding protein involved in cell cycle control
DDX42_MOUSE 56 0 1.06E-06 RNA helicase
H2B3B_MOUSE 56 0 1.06E-06 Histone
EIF3A_MOUSE 57 0 1.09E-06 Translation initiation factor
H2AY_MOUSE 57 0 1.09E-06 Histone
SMRC1_MOUSE 57 0 1.09E-06 Chromatin remodelling – (SWI/SNF, WINAC, nBAF)
H2B1A_MOUSE 55 0 1.71E-06 Histone
SACS_MOUSE 55 0 1.71E-06 HSP70 chaperone
CHD7_MOUSE 53 0 2.70E-06 Chromodomain-helicase-DNA binding protein 7
RPB1_MOUSE 53 0 2.70E-06 DNA directed RNA polymerase II subunit RPB1
CD158_MOUSE 52 0 4.68E-06 Coiled coil domain containing protein 158
DDX3X_MOUSE 52 0 4.68E-06 RNA helicase
ACINU_MOUSE 49 0 6.77E-06 Apoptotic chromatin condensation
EH1L1_MOUSE 50 0 7.28E-06
RB6I2_MOUSE 50 0 7.28E-06 Regulatory subunit of IKK complex
SPTA2_MOUSE 50 0 7.28E-06 Cytoskeleton protein
CROCC_MOUSE 47 0 1.08E-05 Cytoskeleton protein
DYHC1_MOUSE 47 0 1.08E-05 Motor protein
RS9_MOUSE 47 0 1.08E-05 Ribosomal protein
ACTA_MOUSE 48 0 1.14E-05 Actin
FXR1_MOUSE 45 0 1.72E-05 Fragile X mental retardation syndrome-related protein 1
MBB1A_MOUSE 45 0 1.72E-05 Transcriptional activator/repressor – interacts with DNA binding proteins
TBA1A_MOUSE 45 0 1.72E-05 Tubulin
RIF1_MOUSE 43 0 2.75E-05 Required for checkpoint mediated arrest of cell cycle progression
SMC1A_MOUSE 43 0 2.75E-05 Chromosome cohesion during cell cycle and DNA repair
DYHC2_MOUSE 44 0 2.80E-05 Dynein
EWS_MOUSE 44 0 2.80E-05 RNA binding protein – may be repressive
RS2_MOUSE 44 0 2.80E-05 Ribosomal protein
SMCA2_MOUSE 44 0 2.80E-05 Transcriptional activator (nBAF/WINAC)
DDX3Y_MOUSE 42 0 4.41E-05 RNA helicase
MYH10_MOUSE 42 0 4.41E-05 Myosin
PB1_MOUSE 42 0 4.41E-05 Chromatin remodeller – regulator of cell proliferation
TCRG1_MOUSE 42 0 4.41E-05 RNA polII inhibitor
CHD4_MOUSE 40 0 6.98E-05 Histone deacetylase – NuRD complex
MATR3_MOUSE 41 0 7.72E-05 Matrin – transcription factor
SRRM2_MOUSE 41 0 7.72E-05 Pre-mRNA splicing
BAZ1B_MOUSE 38 0 0.000110695 Tyrosine protein kinase – chromatin remodelling
PRP6_MOUSE 38 0 0.000110695 Splicing factor
PSIP1_MOUSE 39 0 0.000120205 Cell differentiation, stress response
RPB2_MOUSE 39 0 0.000120205 DNA-directed RNA polymerase II subunit RPB2
AQR_MOUSE 36 0 0.00017612 Splicing factor
EGF_MOUSE 36 0 0.00017612 Epidermal growth factor
IF4A3_MOUSE 36 0 0.00017612 RNA helicase
RIMS2_MOUSE 36 0 0.00017612 Rab effector involved in exocytosis
SMCA1_MOUSE 37 0 0.000187766 Component of the NURF complex
FMN2_MOUSE 34 0 0.000281027 Central nervous system protein
RBM25_MOUSE 34 0 0.000281027 RNA binding protein – splicing factor
BAHC1_MOUSE 35 0 0.000294268 Coiled coil domain protein
CJ018_MOUSE 35 0 0.000294268
H2A2B_MOUSE 35 0 0.000294268 Histone
TIF1B_MOUSE 35 0 0.000294268 Transcription repression
DDX46_MOUSE 32 0 0.000449671 Splicing factor
ERC2_MOUSE 32 0 0.000449671
K1C18_MOUSE 32 0 0.000449671 Keratin
SFRS1_MOUSE 32 0 0.000449671 Splicing factor
SMU1_MOUSE 32 0 0.000449671
RBBP7_MOUSE 33 0 0.000462695 Component of the PRC2 complex
TCOF_MOUSE 33 0 0.000462695 May play a role in embryonic development
ERC6L_MOUSE 30 0 0.000721432 DNA helicase
ILF3_MOUSE 30 0 0.000721432 Interleukin enhancer factor
MGAP_MOUSE 30 0 0.000721432
VWF_MOUSE 30 0 0.000721432 Von Willebrand factor
CP250_MOUSE 31 0 0.000729902 Centrosome cohesion
DKC1_MOUSE 31 0 0.000729902
DNMT1_MOUSE 31 0 0.000729902 DNA methyltransferase
KDM5B_MOUSE 31 0 0.000729902 H3K4 demethylase
KTN1_MOUSE 31 0 0.000729902 Kinesin
MAN1_MOUSE 31 0 0.000729902 TGF-b antagonist
MDC1_MOUSE 31 0 0.000729902 DNA repair
PGBM_MOUSE 31 0 0.000729902
SF3A1_MOUSE 554 12 1.26E-43 Splicing factor
SF3A3_MOUSE 195 10 1.18E-11 Splicing factor
PRP19_MOUSE 38 2 0.006524444 DNA repair
NPM_MOUSE 65 4 0.000386053 Histone assembly, cell proliferation, histone chaperone
GPC4_MOUSE 30 2 0.031020003 Proteoglycan
RBBP4_MOUSE 43 3 0.0071263 Component of PRC2 complex
H31_MOUSE 69 5 0.000627128 Histone
SP16H_MOUSE 135 11 3.03E-06 FACT complex protein
RS14_MOUSE 54 5 0.00752452 Ribosomal protein
RS11_MOUSE 49 5 0.020706808 Ribosomal protein
DDX5_MOUSE 278 29 5.76E-09 RNA helicase
FBRL_MOUSE 68 8 0.012257672 Pre-rRNA processing
FLNB_MOUSE 73 9 0.011268778 Cytoskeleton protein
NUCL_MOUSE 104 13 0.002483551 Chromatin decondensation
RS3A_MOUSE 72 9 0.015189771 Ribosomal protein
RS4X_MOUSE 70 9 0.020304002 Ribosomal protein
Table 3.1: Mass spectrometry data set showing only hits with a significance of P<0.05. Count 1 refers to the
experimental sample while count 2 refers to the control (IgG) sample.
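The P<0.05 filter described in the caption of table 3.1 can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: `Hit`, `parse_row` and `significant_hits` are hypothetical names, the row layout is assumed to match the table (protein, Ezh2 IP count, control count, P-value, description), and the final input row is invented purely to show a hit being rejected.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    protein: str
    ezh2_count: int     # spectral count in the Ezh2 IP sample (count 1)
    control_count: int  # spectral count in the IgG control sample (count 2)
    p_value: float


def parse_row(row: str) -> Hit:
    """Parse 'NAME count1 count2 p-value [description...]' into a Hit."""
    name, count1, count2, p_value = row.split()[:4]
    return Hit(name, int(count1), int(count2), float(p_value))


def significant_hits(rows, alpha=0.05):
    """Keep hits with p < alpha, ordered from most to least significant."""
    hits = [parse_row(row) for row in rows]
    return sorted((h for h in hits if h.p_value < alpha), key=lambda h: h.p_value)


rows = [
    "SF3A1_MOUSE 554 12 1.26E-43 Splicing factor",
    "DDX5_MOUSE 278 29 5.76E-09 RNA helicase",
    "RBBP4_MOUSE 43 3 0.0071263 Component of PRC2 complex",
    "GPC4_MOUSE 30 2 0.031020003 Proteoglycan",
    # Hypothetical row, not from the data set: fails the P<0.05 cut-off.
    "EXAMPLE_MOUSE 12 9 0.41 Illustrative non-significant hit",
]

for hit in significant_hits(rows):
    print(hit.protein, hit.ezh2_count, hit.control_count, hit.p_value)
```

Sorting by P-value rather than by raw count keeps hits with a non-zero control count, such as DDX5, below hits that were absent from the control.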
3.3 Bioinformatics Analysis
Analysis of the refined data set using UniProt (Universal Protein Resource), a comprehensive
resource for protein sequence and annotation data [21], suggested that the sample was enriched
in proteins associated with several core cellular activities including splicing, transcription and
chromatin remodelling.
Figure 3.2: Relative abundance of proteins associated with the PRC2 complex (dark blue), splicing (orange),
transcription (red) and chromatin remodelling (purple), according to UniProt database mining.
Processing of the refined data set using DAVID (The Database for Annotation, Visualization
and Integrated Discovery), a bioinformatics database which contains a set of functional
annotation tools useful for analysing large data sets [22][23], provided confirmation that the
abbreviated data set was enriched in the biological functions initially demonstrated by
UniProt database mining.
[Figure 3.2 legend: PRC2, Splicing Factors, Transcription Machinery, Chromatin Remodellors, Other]
Figure 3.3: Functional Annotation analysis of the abbreviated data set (table 3.1), showing key enriched functions
(adapted from DAVID output table, appendix 2)
Isolation of smaller subsets of proteins was then carried out to allow for an in-depth analysis
of certain interest groups; primarily splicing, transcription, and chromatin remodelling as the
data set contained a high proportion of proteins associated with each of these functions (figure
3.2, figure 3.3). Each group was isolated and protein counts were graphed to give some
insight into the abundance of certain proteins that could be of interest.
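The sub-grouping and count comparison described above can be sketched as follows. This is an illustrative outline only: the dictionary layout and the `summarise` helper are hypothetical, and only a handful of rows (copied from tables 3.2 to 3.5) are included to keep the example short.

```python
# Each entry maps protein -> (Ezh2 IP spectral count, control spectral count).
# Group membership follows tables 3.2 to 3.5; only a few rows are shown.
subsets = {
    "PRC2": {
        "SUZ12_MOUSE": (76, 0),
        "EZH2_MOUSE": (74, 0),
        "EED_MOUSE": (59, 0),
    },
    "Splicing": {
        "SF3A1_MOUSE": (554, 12),
        "PRP8_MOUSE": (183, 0),
        "DDX5_MOUSE": (278, 29),
    },
    "Transcription": {
        "SPT5H_MOUSE": (126, 0),
        "RPB1_MOUSE": (53, 0),
    },
    "Chromatin remodelling": {
        "SMCA4_MOUSE": (84, 0),
        "SSRP1_MOUSE": (101, 0),
    },
}


def summarise(groups):
    """Return {group: (total IP count, total control count, most abundant protein)}."""
    summary = {}
    for group, proteins in groups.items():
        ip_total = sum(ip for ip, _ in proteins.values())
        control_total = sum(control for _, control in proteins.values())
        top = max(proteins, key=lambda name: proteins[name][0])
        summary[group] = (ip_total, control_total, top)
    return summary


for group, (ip_total, control_total, top) in summarise(subsets).items():
    print(f"{group}: IP={ip_total}, control={control_total}, most abundant={top}")
```

Keeping the control count next to the IP count for every protein makes it easy to spot hits, such as DDX5, whose abundance is partly mirrored in the IgG control.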
STRING analysis was also performed on each subset to provide a visual representation of the
known interactome. STRING is a database of known and predicted protein interactions. The
interactions include direct (physical) and indirect (functional) associations; they are derived
from four sources: genomic context, high-throughput experiments, co-expression, and
previous knowledge (database mining) [24].
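For readers who wish to retrieve a comparable network programmatically rather than through the STRING website, the request can be sketched as below. This is not how the networks in this report were generated (they were produced through the web interface); the helper name `string_network_url`, the default score cut-off of 400 and the chosen protein names are assumptions for illustration, and the parameter names should be checked against the current STRING API documentation before use. The sketch only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base address of the STRING web API (assumed from the STRING site cited above).
STRING_API = "https://string-db.org/api"


def string_network_url(identifiers, species=10090, required_score=400, fmt="tsv"):
    """Build a STRING 'network' request URL for a list of protein names.

    species=10090 is the NCBI taxonomy identifier for Mus musculus.
    required_score is the combined-score cut-off on STRING's 0 to 1000 scale.
    """
    params = {
        # Multiple identifiers are separated by a carriage return (%0D once encoded).
        "identifiers": "\r".join(identifiers),
        "species": species,
        "required_score": required_score,
    }
    return f"{STRING_API}/{fmt}/network?{urlencode(params)}"


url = string_network_url(["Ezh2", "Suz12", "Eed", "Rbbp7", "Rbbp4"])
print(url)
```

Building the URL separately from fetching it keeps the example runnable offline and makes the query parameters explicit.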
[Figure 3.3 bar chart. X-axis: Proteins in Functional Annotation Category (%), 0 to 100. Categories:
Methyltransferase, Methylation, Chromatin Regulator, RNA-binding, DNA-binding, Transcription
Regulation, Spliceosome, Acetylation, Phosphoprotein, Nucleus.]
Figure 3.4: Expression levels of PRC2 proteins in the data set.
Figure 3.5: STRING network diagram of the EZH2 local interactome. The coloured lines refer to the type of evidence
in place to support the specific interaction.
[Figure 3.4 bar chart. Y-axis: Protein Count, 0 to 80. Series: Ezh2 IP, Control.]
Gene  Ezh2 Spectral  Control Spectral  P-value  Description
SUZ12_MOUSE 76 0 1.12E-08 Contains a zinc finger domain for DNA/RNA binding.
EZH2_MOUSE 74 0 1.75E-08 The catalytic subunit of the PRC2 complex. SET domain has methyltransferase activity.
EED_MOUSE 59 0 6.94E-07 Post-translational modification binding
RBBP7_MOUSE 33 0 0.000462695 Histone binding
RBBP4_MOUSE 43 3 0.0071263 Histone binding
Table 3.2: PRC2 proteins isolated from the total data set.
Table 3.2 shows the PRC2 components detected in the data set, while figure 3.4 indicates the
relative abundance of each based on their protein count generated by MaxQuant. The
STRING network in figure 3.5 was generated by the programme itself using only Ezh2 as an
input; this is the known and predicted interactome of Ezh2 according to the STRING
database. However, not all known protein interactions are present in the network, which
suggests that STRING has some limitations and that networks generated by the programme
should be viewed with a critical eye.
Table 3.3 contains the protein hits with known, or predicted, splicing activity which were
detected in the Ezh2 pull-down sample. The protein counts have been graphed to give a visual
representation of the data (figure 3.6). Two proteins, SF3A1 and DDX5, have particularly high
counts. A UniProt database search for each of these proteins reveals that SF3A1 is a subunit of
a splicing factor (SF3A) which is responsible for assembly of the ‘A’ complex of the
spliceosome, while DDX5 is a regulator of pre-mRNA splicing in the cell. Both proteins seem
to have important activity with regard to spliceosome structure and function.
Functional annotation analysis of the subset (figure 3.7) shows enrichment in helicase
activity, RNA processing, and spliceosome proteins, in keeping with the original
assignment of these proteins to a splicing sub-group.
The STRING network for this sub-group (figure 3.8) shows that the majority of the proteins
are located in an interaction-rich core which represents the spliceosome and its associated
accessory proteins. Interaction between the Rbbp7 subunit of the PRC2 complex and Prpf19,
an integral part of the spliceosome, indicates potential cross-talk between the two. However,
it should be noted that Rbbp7 is known to belong to several other protein complexes,
including the chromatin remodelling complex NuRD [25], and therefore interactions with it
do not necessarily imply interactions with the PRC2 complex. Nevertheless, given the
abundance of splicing proteins in the sample, the Rbbp7 subunit of the PRC2 complex may
act as a bridge between Ezh2 and the spliceosome, conferring a functional relationship
between the two complexes.
Gene  Ezh2 Spectral  Control Spectral  P-value  Description
AQR_MOUSE 36 0 0.00017612 Splicing factor
CDC5L_MOUSE 56 0 1.06E-06 Component of the PRP19-CDC5L complex, integral part of the spliceosome.
DDX46_MOUSE 32 0 0.000449671 Splicing factor
DHX15_MOUSE 78 0 4.10E-09 Putative pre-mRNA-splicing factor ATP-dependent RNA helicase
HNRPM_MOUSE 38 24 0.003373485 Splicing factor
PRP19_MOUSE 38 2 0.006524444 DNA repair. Pre mRNA splicing – associated with CDC5L
PRP6_MOUSE 38 0 0.000110695 Splicing factor
PRP8_MOUSE 183 0 2.18E-20 Central component of spliceosome
RBM25_MOUSE 34 0 0.000281027 RNA binding protein – splicing factor
SF3A1_MOUSE 554 12 1.26E-43 Splicing factor
SF3A2_MOUSE 106 0 3.88E-12 Splicing factor
SF3A3_MOUSE 195 10 1.18E-11 Splicing factor
SF3B1_MOUSE 150 0 9.82E-17 Splicing factor
SF3B3_MOUSE 142 0 5.90E-16 Splicing factor
SFRS1_MOUSE 32 0 0.000449671 Splicing factor
SFRS7_MOUSE 60 48 3.40E-07 Splicing factor
SRRM2_MOUSE 41 0 7.72E-05 Involved in pre-mRNA splicing
U5S1_MOUSE 98 0 4.20E-11 Component of the U5 snRNP complex required for pre-mRNA splicing
DDX42_MOUSE 56 0 1.06E-06 RNA helicase
RPB2_MOUSE 39 0 0.000120205 RNA helicase
DDX5_MOUSE 278 29 5.76E-09 Probable RNA helicase
DDX3X_MOUSE 52 0 4.68E-06 RNA helicase
DDX3Y_MOUSE 42 0 4.41E-05 Probable RNA helicase
DDX21_MOUSE 127 0 2.39E-14 Probable RNA helicase
DDX17_MOUSE 101 11 0.000858146 Probable RNA helicase
Table 3.3: Splicing associated proteins isolated from the total data set.
Figure 3.6: Protein count levels for the isolated splicing associated proteins.
[Figure 3.6 bar chart. Y-axis: Protein Count, 0 to 600. Series: Ezh2 IP, Control. X-axis proteins: AQR,
CDC5L, DDX46, DHX15, HNRPM, PRP19, PRP6, PRP8, RBM25, SF3A1, SF3A2, SF3A3, SF3B1, SF3B3,
SFRS1, SFRS7, SRRM2, U5S1, DDX42, DDX5, DDX21, DDX3X, DDX3Y, DDX17.]
Figure 3.7: Functional Annotation analysis of splicing proteins
[Figure 3.7 bar chart. X-axis: Proteins in Functional Annotation Category (%), 0 to 100. Categories:
mRNA splicing, Spliceosome, RNA processing, RNA binding, Helicase.]
Figure 3.8: STRING network diagram of the detected splicing proteins and the PRC2 complex (circled).
Figure 3.9: Expression levels of transcription associated proteins in the data set.
[Figure 3.9 bar chart. Y-axis: Protein Count, 0 to 140. Series: Ezh2 IP, Control.]
Gene  Ezh2 Spectral  Control Spectral  P-value  Description
RPB1_MOUSE 53 0 2.70E-06 DNA-directed RNA polymerase II subunit RPB1
SPT5H_MOUSE 126 0 4.16E-14 Transcription factor
CHD7_MOUSE 53 0 2.70E-06 Chromodomain helicase DNA binding protein 7
MBB1A_MOUSE 45 0 1.72E-05 May activate or repress transcription (HDAC activity)
TCRG1_MOUSE 42 0 4.41E-05 Possible Transcription factor
MATR3_MOUSE 41 0 7.72E-05 Matrin 3
PSIP1_MOUSE 39 0 0.000120205 Transcriptional Co-activator
RPB2_MOUSE 39 0 0.000120205 DNA driven RNA pol II
TIF1B_MOUSE 35 0 0.000294268 Transcription repression
ERC6L_MOUSE 30 0 0.000721432 DNA helicase
Table 3.4: Transcription associated proteins isolated from the data set.
Figure 3.10: Functional Annotation Analysis of transcription associated proteins
Figure 3.11: STRING network diagram of the transcription associated proteins and the PRC2 complex.
[Figure 3.10 bar chart. X-axis: Proteins in Functional Annotation Category (%), 0 to 100. Categories:
Acetylation, DNA binding, Transcription, Phosphoprotein, RNA pol II core complex, Regulation of
Transcription.]
Table 3.4 indicates the proteins directly associated with transcription machinery (as opposed
to epigenetic transcriptional regulators which can be found in the chromatin remodelling
sub-group) and, as per previous sub-groups, figure 3.9 indicates the protein counts of each hit.
One protein in particular, Spt5h, a regulator of transcriptional elongation by RNA pol II, has a
high protein count and the lowest p-value of the listed proteins (4.16E-14), indicating with
high confidence that it is enriched in the experimental sample.
Further analysis of the data set showed that two components of DNA-directed RNA
polymerase II were detected in the sample, RPB1 (Polr2a) and RPB2 (Polr2b), as well as
Spt5h (Supt5h) and the transcription elongation factor TCRG1 (Tcerg1), all of which are
directly involved in RNA elongation during transcription. According to the STRING
network (figure 3.11) these proteins may be interacting with the PRC2 complex through the
matrin 3 protein, whose function has yet to be fully determined but is thought to play a role in
transcription. These interactions may provide an insight into alternative methods of
transcriptional silencing exerted by the PRC2 complex.
Figure 3.12: Expression levels of chromatin remodelling associated proteins in the data set.
[Figure 3.12 bar chart. Y-axis: Protein Count, 0 to 140. Series: Ezh2 IP, Control.]
Gene  Ezh2 Spectral  Control Spectral  P-value  Description
SMCA4_MOUSE 84 0 1.01E-09 Transcriptional activator BRG-1
SMCA5_MOUSE 73 0 1.62E-08 Helicase with ATP-dependent nucleosome-remodeling activity
ARI1A_MOUSE 72 0 2.72E-08 Component of the nBAF remodelling complex
PB1_MOUSE 42 0 4.41E-05 Transcriptional activator/repressor. Negative regulator of cell proliferation
BAZ1B_MOUSE 38 0 0.000110695 Tyrosine-protein kinase
SMRC1_MOUSE 57 0 1.09E-06 SWI/SNF component
SMCA2_MOUSE 44 0 2.80E-05 Transcriptional activator. Component of the SWI/SNF, WINAC and nBAF complexes
SP16H_MOUSE 135 11 3.03E-06 FACT complex component, histone chaperone
NUCL_MOUSE 104 13 0.002483551 Chromatin decondensation
SSRP1_MOUSE 101 0 1.56E-11 FACT complex component
Table 3.5: Chromatin remodelling proteins isolated from the total data set.
Figure 3.13: Functional Annotation analysis of the isolated chromatin remodelling proteins.
Figure 3.14: STRING network diagram of chromatin remodelling proteins and the PRC2 complex.
[Figure 3.13 bar chart. X-axis: Proteins in Functional Annotation Category (%), 0 to 100. Categories:
Chromatin Remodeling Complex, SWI/SNF complex, Transcription Regulation, Chromatin
Assembly/Disassembly.]
As with previous sub-groups the initial tables and figures (table 3.5, figure 3.12) refer to the
isolated proteins and show the relative abundance of each protein. Figure 3.13 shows the
functional annotation of the sub-group proteins, indicating enrichment in transcriptional
regulation activity and chromatin remodelling proteins.
The chromatin remodelling components detected are primarily associated with the FACT
(facilitates chromatin transcription) and SWI/SNF complexes, with SUPT16H and SSRP1
being the two main protein subunits of the FACT complex [26] and many of the other proteins
being SWI/SNF related. The FACT complex is associated with active transcription, and
therefore interactions with Ezh2 and the PRC2 complex may be inhibitory, reinforcing the
transcriptional silencing exerted by the PRC2 complex.
Once again the proteins seem to be interacting with PRC2 through the Rbbp7 subunit (figure
3.14) which, as previously mentioned, is a transient interactor of PRC2 and several other
protein complexes.
Figure 3.15 shows a STRING network combining the three sub-groups (splicing,
transcription, chromatin remodelling) and PRC2 to give a visual overview of the known and
potential interactions between all four groups. The resulting map shows many interactions
between components of all four groups, which suggests functional associations between
Ezh2 (PRC2) and each of these groups. However, further experiments would be required to
confirm and characterise these interactions and their downstream effects.
Figure 3.15: STRING network diagram of all three subgroups (splicing, chromatin remodelling, and transcription)
and the PRC2 complex. Thicker lines indicate a stronger confidence in the interaction.
4. Discussion
4.1 The current Ezh2 interactome
Several key interactions involving Ezh2 have already been characterised, including those with
PHF1 [15], cell cycle regulators [28], and DNA methyltransferases (DNMTs) [29]. It is clear
from these studies that Ezh2 and the PRC2 complex have an intricate web of interactors
involved in the maintenance of pluripotency and cell differentiation. Because Ezh2 and PRC2
are so important for the regulation of these processes, any dysfunction in their activity can
have severe downstream consequences. The primary consequence is tumourigenesis [4][5][6],
but mutations in Ezh2 are also known to cause Weaver Syndrome, a congenital disease
associated with general overgrowth starting in the pre-natal stage [31]. The association of
mutated Ezh2 with cancer has been extensively studied, including therapeutic strategies
involving inhibition of Ezh2 in cases of over-expression [32][33]. Ezh2 has also been suggested
as a potential biomarker for certain cancers [34][35]; however, high levels of Ezh2 are
associated with poor prognosis and late stage cancers, which may undermine the effectiveness
of the protein as a biomarker. The importance of Ezh2 for stem cell health makes
characterisation of its interactions and downstream effects essential for understanding the
mechanisms by which stem cells are regulated.
4.2 Large Scale Proteomic Analysis of Protein-Protein Interactions
Large scale proteomic approaches have proved to be extremely useful in the past; however, it
is important to understand the limitations of such approaches in order to critically assess the
results obtained from them. One of the prime concerns in any large scale proteomics project is
the complexity of the proteome itself: post-translational modifications and alternative splicing
can cause major differences in protein content and activity between cells. What may be
true for one cell type under certain conditions may not be true for most other cell types. It is
for this reason that large scale analysis, like that carried out in this experiment, can be used to
detect global motifs and functional enrichments in a protein population but must be followed
up by smaller, more specific experimental analysis of select subsets of proteins. Western blot
analysis, yeast two-hybrid methods and fluorescence resonance energy transfer (FRET)
experiments are three potential follow-up experiments which can be used to confirm
protein-protein interactions. Once an interaction has been confirmed it is then possible to
proceed with experiments to determine the type of interaction involved and its effects.
RNAi technology, conditional knock-outs, and generation of mutant animal models are all
useful in determining the effect of a protein-protein interaction: by interfering with or
removing a protein it is possible to observe its effect on other proteins through alterations in
phenotype, proteomic composition, gene expression etc.
4.3 Global analysis of the Ezh2 interactome data
Global analysis of the data set shows a vast interactome, with several interactions of interest. A
complex network of interactions seems to be occurring between transcription factors, splicing
proteins and chromatin remodelling proteins (figure 3.15), which may point towards
mechanisms of genomic regulation not yet characterised with regard to the PRC2 complex.
The PRC2 complex already induces downstream chromatin remodelling through its interaction
with PRC1; therefore it is possible that alternate interactions with other chromatin remodelling
complexes may also be used to induce chromatin condensation and prevent transcription. The
chromatin remodelling complex FACT consists of two subunits, SPT16 and SSRP1 [26], both
of which were detected in the data set with high confidence and in high abundance (table 3.5).
The FACT complex is known to affect RNA polymerase II, the key protein involved in
transcription, by removing an H2A-H2B histone dimer to allow RNA pol II to access the
desired regions of DNA contained within the nucleosome [36]. It has been shown that the
H2AK119ub1 mark induced by the PRC1 complex prevents the binding of the FACT
complex to target genes and inhibits transcription [30]; therefore the PRC2 complex has
downstream effects on the FACT complex. However, this does not explain why both FACT
subunits, SPT16 and SSRP1, are present in the data set.
Two subunits of the RNA pol II complex (RPB1 (Polr2a) and RPB2 (Polr2b)), as well as two
transcription elongation factors (TCRG1 (Tcerg1) and Spt5h (Supt5h)), were found in the data
set, suggesting interplay between PRC2, FACT, and RNA pol II. However, the low
number of transcription factors in the data set would not provide enough evidence on its own
to support further research into the area. In light of the known interactions between FACT and
RNA pol II it is possible that PRC2 has an indirect effect on RNA pol II through FACT
inhibition. It is also possible that PRC2 interacts with RNA pol II to receive a strand of ncRNA
which it uses to target specific genes [17], although a definite function for the interaction
between PRC2 and ncRNA is still uncharacterised.
4.3 Splicing and the PRC2 complex
It is clear from the results that a large number of splicing factors were contained in the sample,
primarily belonging to the spliceosome or the DEAD-box containing RNA helicase family (DDX
proteins). This would suggest a functional relationship between the spliceosome and the
PRC2 complex; however, no characterised interactions between the two were found in the
literature. The spliceosome is essential for processing transcribed RNA into mRNA
through the excision of introns (non-coding regions) and ligation of the remaining exons
(coding regions) into a translatable mRNA construct. Without the removal of these introns the
transcribed RNA cannot be translated into a functional protein, and therefore the gene will
have no protein product expressed and no cellular effect. If a relationship between the
spliceosome and the PRC2 complex does exist it is likely to be quite complex, and it is
currently unclear whether the two share an antagonistic relationship. It is possible that the
PRC2 complex inhibits the activity of the spliceosome, preventing processing of pre-mRNA
into translatable mRNA and thus ultimately suppressing gene expression.
In order for this to be a viable avenue for further research it would first be necessary to
provide additional evidence to support the idea that PRC2 and the spliceosome are indeed
interacting. Subsequent pull-downs paired with Western blot analysis would be the simplest
way to achieve this. By probing for both Ezh2 and a core component of the spliceosome, such
as PRP9, PRP18, or CDC5L, it would be possible to show if the two are pulled down together
in subsequent co-immunoprecipitation experiments and therefore warrant further
investigation.
If the Western analysis was successful and indeed showed that PRC2 and the spliceosome are
pulled down together, it would then be necessary to establish how the two interact and to what
effect. Knock-down of Ezh2 using RNAi technology would prevent the formation of the
PRC2 complex, as Ezh2 is essential for PRC2 stability; if the two do interact, this should have
an effect on splicing within the cell. The splicing activity in a cell can be measured via mRNA
levels using next generation sequencing methods such as Illumina sequencing; however, it
would be difficult to determine whether a change in mRNA levels is due to increased
spliceosome activity in the absence of PRC2 or purely because transcription of certain genes
is no longer repressed by PRC2.
A potential solution to this problem would be the use of reverse transcriptase polymerase
chain reaction (RT-PCR) methods, which would amplify specific pre-mRNA and mRNA
species in a sample. If the PRC2 complex was inhibiting spliceosome activity
on certain genes we would expect to see more of the pre-mRNA from those genes present, as
opposed to the processed mRNA product. Upon knock-down of Ezh2 the relative levels of
pre-mRNA and mRNA should shift if the spliceosome in that region is no longer being
inhibited. A schematic overview of the proposed experiments can be seen in figure 4.1 and
figure 4.2.
Figure 4.1: Active inhibition of the spliceosome by PRC2 would result in higher levels of pre-mRNA compared to
processed mRNA, which can be detected through a combination of RT-PCR and agarose gel electrophoresis.
Figure 4.2: No inhibition of the spliceosome by PRC2 would result in an increase in processed mRNA levels compared
to pre-mRNA
Due to the size difference between pre-mRNA and processed mRNA it is possible to detect
alterations in pre-mRNA processing using agarose gel electrophoresis to separate the
molecules based on their size. Processed mRNA is generally shorter than its pre-mRNA
precursor due to the excision of introns by the spliceosome, and therefore travels further in an
electrophoresis gel. These properties can be exploited in the proposed experiments in order to
detect changes in spliceosomal activity in PRC2+ and PRC2- environments.
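The size argument above can be made concrete with a short worked example. This is a sketch under stated assumptions: the exon and intron lengths and the band intensities are invented for illustration, `amplicon_sizes` and `unspliced_fraction` are hypothetical helper names, and a single primer pair flanking the intron(s) is assumed to amplify both the spliced and unspliced template.

```python
def amplicon_sizes(exon_lengths, intron_lengths):
    """Return (pre_mrna_bp, mrna_bp) for a primer pair spanning the given region.

    exon_lengths: lengths in bp of the exonic sequence between the primers, in order.
    intron_lengths: lengths in bp of the introns separating those exons.
    """
    if len(intron_lengths) != len(exon_lengths) - 1:
        raise ValueError("expected exactly one intron between each pair of adjacent exons")
    spliced = sum(exon_lengths)                 # introns removed
    unspliced = spliced + sum(intron_lengths)   # introns retained
    return unspliced, spliced


def unspliced_fraction(pre_mrna_signal, mrna_signal):
    """Fraction of the total band signal that comes from the pre-mRNA band."""
    total = pre_mrna_signal + mrna_signal
    if total <= 0:
        raise ValueError("band signals must sum to a positive value")
    return pre_mrna_signal / total


# Hypothetical target: two exon segments of 120 bp and 150 bp flanking a 400 bp intron.
pre_bp, mature_bp = amplicon_sizes([120, 150], [400])
print(f"pre-mRNA band: {pre_bp} bp, processed mRNA band: {mature_bp} bp")

# Hypothetical densitometry readings (arbitrary units) before and after Ezh2 knock-down.
with_prc2 = unspliced_fraction(60, 40)
without_prc2 = unspliced_fraction(20, 80)
print(f"unspliced fraction with PRC2: {with_prc2:.2f}, after knock-down: {without_prc2:.2f}")
```

Comparing the unspliced fraction, rather than the absolute intensity of either band, is one way to separate a change in splicing from a general increase in transcription after Ezh2 knock-down.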
4.4 Conclusion
Overall it is clear from this project, and from the literature, that the interactome of Ezh2 and
the Polycomb proteins is diverse and complicated. The effects of Ezh2 and PRC2 on gene
repression have been well characterised, as have some of the molecular mechanisms
involved. However, further research is required to understand the full extent of PRC2's effect
on transcription and gene expression.
Further research into this field could provide new and alternative therapeutic targets for
cancer treatment, and potentially refine our understanding of stem cell mechanics with
downstream effects on the use of stem cells in regenerative medicine.
5. Acknowledgements
I’d first like to thank Gerard Cagney for the opportunity to work with him and his team on
this project, and for all his help and advice along the way. I’d also like to convey my
appreciation to the members of Dr. Cagney’s group for their support and helpful advice
throughout this project; a big thank you to Dr. Aisling Robinson for teaching me the protocols
and answering all my questions, and to Giorgio Oliviero, Ariane Watson, Guillermo
Gambero, and Nayla Munawar for all their guidance and advice. Lastly I’d like to thank Dr.
Adrian Bracken (TCD) and his group for providing me with Ezh2 immunoprecipitation
samples, without which this project would not have been possible.
Word Count: 6892
6. Bibliography
1. Takahashi, K., Yamanaka, S. (2006) ‘Induction of Pluripotent Stem Cells from Mouse
Embryonic and Adult Fibroblast Cultures by Defined Factors’. Cell 126: 663-676
2. Boyer, L., Plath, K. et al. (2006) ‘Polycomb Complexes Repress Developmental
Regulators in Murine Embryonic Stem Cells’. Nature 441: 349-353
3. Sauvageau, M., Sauvageau, G. (2010). ‘Polycomb Group Proteins: Multi-Faceted
Regulators of Somatic Stem Cells and Cancer’. Cell Stem Cell. 7(3):299-313.
4. Simon, J.A, Lange, C.A. (2008) ‘Roles of the EZH2 histone methyltransferase in cancer
epigenetics’. Mutation Research/Fundamental and Molecular Mechanisms of
Mutagenesis 647: 21-29.
5. Chang, C-J., Hung, M-C. (2011) ‘The Role of EZH2 in Tumour Progression’. British
Journal of Cancer. 106:243-247.
6. Chase, A., Cross, N.C.P. (2011) ‘Aberrations of EZH2 in Cancer’. Clinical Cancer
Research. 17:2613-2618.
7. Luger, K., Richmond, T.J. (1998) ‘The Histone Tails of the Nucleosome’. Current
Opinion in Genetics and Development. 8:140-146.
8. Strahl, B.D., Allis, C.D. (2000) ‘The Language of Covalent Histone Modifications’.
Nature. 403:41-45.
9. Vastenhouw, N., Schier, A. (2012) ‘Bivalent Histone Modifications in Early
Embryogenesis’. Current Opinion in Cell Biology. 24:374-386.
10. Ekwall K., Olsson T., Turner BM., Cranston G., Allshire RC. (1997). ‘Transient
inhibition of histone deacetylation alters the structural and functional imprint at fission
yeast centromeres’. Cell. 91:1021–1032.
11. VerMilyea, M., O’Neill, L., Turner, B. (2009). ‘Transcription-Independent Heritability of
Induced Histone Modifications in the Mouse Preimplantation Embryo’. PLoS One. 6(4)
12. Faust, C., Schumacher, A., Holdener, B., Magnuson, T. (1995) ‘The EED mutation disrupts
anterior mesoderm production in mice’. Development. 121: 273-285.
13. O’Carroll, D., Erhardt, S., Pagani, M., Barton, S.C., Surani, M.A, Jenuwein, T. (2001).
‘The polycomb-group gene EZH2 is required for early mouse development’. Mol. Cell
Biol. 21: 4330-4336.
14. Pasini, D., Bracken, A.P., Jensen, M.R., Lazzerini Denchi, E., Helin, K. (2004). ‘Suz12 is
essential for mouse development and Ezh2 methyltransferase activity’. EMBO J. 23:
4061-4071.
15. Sarma, K., Margueron, R., Ivanov, A., Pirrotta, V., Reinberg, D. (2008). ‘Ezh2 requires
PHF1 to efficiently catalyse H3 lysine 27 trimethylation in vivo’. Mol. Cell Biol. 28(8):
2718-2731
16. Pasini, D., Cloos, P.A., Walfridsson, J. et al. (2010) ‘JARID2 Regulates Binding of the
Polycomb Repressive Complex 2 to Target Genes in ES Cells’. Nature 464: 306-310
17. Kanhere, A., Viiri, K., Araújo, C.C. et al. (2010) ‘Short RNAs are Transcribed from
Repressed Polycomb Target Genes and Interact with Polycomb Repressive Complex-2’.
Molecular Cell 38(5): 675-688
18. Margueron, R., Reinberg, D. (2011) ‘The Polycomb complex PRC2 and its mark in life’.
Nature 469: 343-349
19. Cox, J. & Mann, M. (2008) ‘MaxQuant enables high peptide identification rates,
individualized p.p.b.-range mass accuracies and proteome-wide protein quantification’.
Nature Biotechnol. 26: 1367–1372.
20. MaxQuant. Available at: http://www.Maxquant.org/index.htm [accessed 08/02/13]
21. UniProt (Universal Protein Resource). Available at: http://www.uniprot.org/help/about
[accessed 10/02/13].
22. Huang DW, Sherman BT, Lempicki RA. (2009) ‘Systematic and integrative analysis of
large gene lists using DAVID Bioinformatics Resources’. Nature Protoc. 4(1):44-57.
23. Huang DW, Sherman BT, Lempicki RA. (2009) ‘Bioinformatics enrichment tools: paths
toward the comprehensive functional analysis of large gene lists’. Nucleic Acids Res.
37(1):1-13.
24. STRING. Available at: http://www.string-db.org [accessed: 18/02/13]
25. Zhang, Y., Ng, H-H, et al. (1999) ‘Analysis of the NuRD subunits reveals a histone
deacetylase core complex and a connection with DNA methylation’. Genes and
Development 13: 1924-1935.
26. Orphanides, G., Wu, W.H., Lane, W.S., Hampsey, M., Reinberg, D. (1999) ‘The
chromatin-specific transcription elongation factor FACT comprises human SPT16 and
SSRP1 proteins’. Nature 400(6741): 284-288.
27. Allis, C.D. (Ed.), Jenuwein, T. (Ed.), Reinberg, D. (Ed.), Caparros, M.L. (Ed.) (2007)
‘Epigenetics’. New York: Cold Spring Harbor Laboratory Press
28. Kaneko, S., Li, G., Son, J. et al (2010) ‘Phosphorylation of the PRC2 component Ezh2 is
cell cycle-regulated and up-regulates its binding to ncRNA’. Genes & Development 24:
2615-2620.
29. Viré, E., Brenner, C., Deplus, R. et al. (2006) ‘The Polycomb Group Protein EZH2
Directly Controls DNA Methylation’. Nature 439(7078): 871-874.
30. Zhou, W., Zhu, P., Wang, J., Pascual, G. et al. (2008) ‘Histone H2A Monoubiquitination
Represses Transcription by Inhibiting RNA Polymerase II Transcriptional Elongation’.
Mol. Cell 29(1): 69-80.
31. Gibson, W.T., Hood, R., Zhan, S.H. et al. (2012) ‘Mutations in Ezh2 Cause Weaver
Syndrome’. American Journal of Human Genetics 90(1): 110-118.
32. Kikuchi J. et al. (2012) ‘Epigenetic Therapy with 3-deazaneplanocin A, an Inhibitor of the
Histone Methyltransferase EZH2, Inhibits Growth of Non-Small Cell Lung Cancer Cells’.
Lung Cancer. 78(2):138-143.
33. Crea, F., Fornaro, L., Bocci, G. et al. (2012) ‘EZH2 inhibition: targeting the crossroad of
tumor invasion and angiogenesis’. Cancer Metastasis Review 31(3-4): 753-761.
34. Hajósi-Kalcakosz, S., Dezsó, K., Bodor, S. et al (2012) ‘Enhancer of zeste homologue 2
(Ezh2) is a reliable immunohistochemical marker to differentiate malignant and benign
hepatic tumors’. Diagnostic Pathology 7: 86.
35. Cai, M., Tong, Z., Zheng, F. et al. (2011) ‘EZH2 protein: a promising immunomarker for
the detection of hepatocellular carcinomas in liver needle biopsies’ Gut 60: 967-976.
36. Reinberg, D., Sims, R.J. (2006) ‘deFACTo Nucleosome Dynamics’. The Journal of
Biological Chemistry 281: 23297-23301.
7. Appendices
Appendix 1
Gene  Ezh2 Spectral  Control Spectral  P-value  Odds Ratio
TITIN_MOUSE 382 0 1.22E-41 NA
GUF1_MOUSE 231 0 1.20E-25 NA
PRP8_MOUSE 183 0 2.18E-20 NA
SF3B1_MOUSE 150 0 9.82E-17 NA
SF3B3_MOUSE 142 0 5.90E-16 NA
DDX21_MOUSE 127 0 2.39E-14 NA
TOP2A_MOUSE 127 0 2.39E-14 NA
SPT5H_MOUSE 126 0 4.16E-14 NA
SF3A2_MOUSE 106 0 3.88E-12 NA
SSRP1_MOUSE 101 0 1.56E-11 NA
U5S1_MOUSE 98 0 4.20E-11 NA
SMCA4_MOUSE 84 0 1.01E-09 NA
H2B1B_MOUSE 81 0 2.62E-09 NA
H2B1C_MOUSE 81 0 2.62E-09 NA
H2B1K_MOUSE 81 0 2.62E-09 NA
DHX15_MOUSE 78 0 4.10E-09 NA
K2C6B_MOUSE 78 0 4.10E-09 NA
H2A1K_MOUSE 75 0 1.02E-08 NA
SUZ12_MOUSE 76 0 1.12E-08 NA
SMCA5_MOUSE 73 0 1.62E-08 NA
SPT6H_MOUSE 73 0 1.62E-08 NA
EZH2_MOUSE 74 0 1.75E-08 NA
ARI1A_MOUSE 72 0 2.72E-08 NA
SMG1_MOUSE 67 0 6.59E-08 NA
H2AV_MOUSE 60 0 4.17E-07 NA
LRP2_MOUSE 60 0 4.17E-07 NA
ASH1L_MOUSE 61 0 4.44E-07 NA
ASPM_MOUSE 61 0 4.44E-07 NA
VP13C_MOUSE 61 0 4.44E-07 NA
EED_MOUSE 59 0 6.94E-07 NA
CDC5L_MOUSE 56 0 1.06E-06 NA
DDX42_MOUSE 56 0 1.06E-06 NA
H2B3B_MOUSE 56 0 1.06E-06 NA
EIF3A_MOUSE 57 0 1.09E-06 NA
H2AY_MOUSE 57 0 1.09E-06 NA
SMRC1_MOUSE 57 0 1.09E-06 NA
H2B1A_MOUSE 55 0 1.71E-06 NA
SACS_MOUSE 55 0 1.71E-06 NA
CHD7_MOUSE 53 0 2.70E-06 NA
RPB1_MOUSE 53 0 2.70E-06 NA
CD158_MOUSE 52 0 4.68E-06 NA
DDX3X_MOUSE 52 0 4.68E-06 NA
ACINU_MOUSE 49 0 6.77E-06 NA
EH1L1_MOUSE 50 0 7.28E-06 NA
RB6I2_MOUSE 50 0 7.28E-06 NA
SPTA2_MOUSE 50 0 7.28E-06 NA
CROCC_MOUSE 47 0 1.08E-05 NA
DYHC1_MOUSE 47 0 1.08E-05 NA
RS9_MOUSE 47 0 1.08E-05 NA
ACTA_MOUSE 48 0 1.14E-05 NA
FXR1_MOUSE 45 0 1.72E-05 NA
MBB1A_MOUSE 45 0 1.72E-05 NA
TBA1A_MOUSE 45 0 1.72E-05 NA
RIF1_MOUSE 43 0 2.75E-05 NA
SMC1A_MOUSE 43 0 2.75E-05 NA
DYHC2_MOUSE 44 0 2.80E-05 NA
EWS_MOUSE 44 0 2.80E-05 NA
RS2_MOUSE 44 0 2.80E-05 NA
SMCA2_MOUSE 44 0 2.80E-05 NA
DDX3Y_MOUSE 42 0 4.41E-05 NA
MYH10_MOUSE 42 0 4.41E-05 NA
PB1_MOUSE 42 0 4.41E-05 NA
TCRG1_MOUSE 42 0 4.41E-05 NA
CHD4_MOUSE 40 0 6.98E-05 NA
MATR3_MOUSE 41 0 7.72E-05 NA
SRRM2_MOUSE 41 0 7.72E-05 NA
BAZ1B_MOUSE 38 0 0.000110695 NA
PRP6_MOUSE 38 0 0.000110695 NA
PSIP1_MOUSE 39 0 0.000120205 NA
RPB2_MOUSE 39 0 0.000120205 NA
AQR_MOUSE 36 0 0.00017612 NA
EGF_MOUSE 36 0 0.00017612 NA
IF4A3_MOUSE 36 0 0.00017612 NA
RIMS2_MOUSE 36 0 0.00017612 NA
SMCA1_MOUSE 37 0 0.000187766 NA
FMN2_MOUSE 34 0 0.000281027 NA
RBM25_MOUSE 34 0 0.000281027 NA
BAHC1_MOUSE 35 0 0.000294268 NA
CJ018_MOUSE 35 0 0.000294268 NA
H2A2B_MOUSE 35 0 0.000294268 NA
TIF1B_MOUSE 35 0 0.000294268 NA
DDX46_MOUSE 32 0 0.000449671 NA
ERC2_MOUSE 32 0 0.000449671 NA
K1C18_MOUSE 32 0 0.000449671 NA
SFRS1_MOUSE 32 0 0.000449671 NA
SMU1_MOUSE 32 0 0.000449671 NA
RBBP7_MOUSE 33 0 0.000462695 NA
TCOF_MOUSE 33 0 0.000462695 NA
ERC6L_MOUSE 30 0 0.000721432 NA
ILF3_MOUSE 30 0 0.000721432 NA
MGAP_MOUSE 30 0 0.000721432 NA
VWF_MOUSE 30 0 0.000721432 NA
CP250_MOUSE 31 0 0.000729902 NA
DKC1_MOUSE 31 0 0.000729902 NA
DNMT1_MOUSE 31 0 0.000729902 NA
KDM5B_MOUSE 31 0 0.000729902 NA
KTN1_MOUSE 31 0 0.000729902 NA
MAN1_MOUSE 31 0 0.000729902 NA
MDC1_MOUSE 31 0 0.000729902 NA
PGBM_MOUSE 31 0 0.000729902 NA
SF3A1_MOUSE 554 12 1.26E-43 13.22822078
SF3A3_MOUSE 195 10 1.18E-11 5.587371232
PRP19_MOUSE 38 2 0.006524444 5.444105303
NPM_MOUSE 65 4 0.000386053 4.656142694
GPC4_MOUSE 30 2 0.031020003 4.297977871
RBBP4_MOUSE 43 3 0.0071263 4.106956632
H31_MOUSE 69 5 0.000627128 3.954139641
SP16H_MOUSE 135 11 3.03E-06 3.516527349
RS14_MOUSE 54 5 0.00752452 3.094544067
RS11_MOUSE 49 5 0.020706808 2.808012209
DDX5_MOUSE 278 29 5.76E-09 2.746753674
IMB1_MOUSE 37 4 0.059257196 2.650419687
DDX17_MOUSE 101 11 0.00085814594989 2.630883424
FBRL_MOUSE 68 8 0.012257672 2.435520794
FLNB_MOUSE 73 9 0.011268778 2.324091738
NUCL_MOUSE 104 13 0.002483551 2.292254865
RS3A_MOUSE 72 9 0.015189771 2.292254865
RS4X_MOUSE 70 9 0.020304002 2.228581118
RS8_MOUSE 44 6 0.089011007 2.101233626
GBLP_MOUSE 65 10 0.069847922 1.862457077
RS18_MOUSE 57 9 0.102930169 1.814701768
HNRPF_MOUSE 35 6 0.345661769 1.671435839
VIGLN_MOUSE 81 14 0.083375425 1.657791465
RS15A_MOUSE 46 8 0.250104603 1.647558184
DHX9_MOUSE 82 15 0.112667001 1.566374157
ANXA1_MOUSE 38 7 0.36919869 1.555458658
RS3_MOUSE 105 21 0.161310175 1.43265929
RU2A_MOUSE 30 6 0.548116498 1.43265929
HNRH1_MOUSE 34 7 0.572424102 1.391726168
H2AX_MOUSE 58 12 0.387046223 1.384903981
ROA3_MOUSE 42 9 0.502837955 1.337148671
TOP2B_MOUSE 51 11 0.447403643 1.328465887
RSSA_MOUSE 74 16 0.373511099 1.325209844
FLNC_MOUSE 35 8 0.713823077 1.253576879
RSMB_MOUSE 38 9 0.726594357 1.209801179
RSMN_MOUSE 38 9 0.726594357 1.209801179
LAMA5_MOUSE 37 9 0.858980928 1.177964305
H4_MOUSE 102 26 0.670001306 1.12408652
PUF60_MOUSE 35 9 0.857949026 1.114290559
SFPQ_MOUSE 35 9 0.857949026 1.114290559
ANXA6_MOUSE 34 9 1 1.082453686
RS16_MOUSE 33 9 1 1.050616813
FLNA_MOUSE 91 25 0.911237699 1.042975963
SMD2_MOUSE 40 11 1 1.041934029
H2A2A_MOUSE 73 21 1 0.996039316
DDX3L_MOUSE 48 14 1 0.982394942
HNRPU_MOUSE 91 27 0.911565935 0.965718485
LMNB1_MOUSE 76 23 0.808907924 0.946800922
H2B2B_MOUSE 82 25 0.815625424 0.939824494
MYH14_MOUSE 58 18 0.782344563 0.92326932
HNRPC_MOUSE 32 10 0.85256351 0.916901946
HNRPD_MOUSE 30 10 0.703513746 0.859595574
UBIQ_MOUSE 30 10 0.703513746 0.859595574
HSP7C_MOUSE 86 29 0.432880647 0.849715165
CLH_MOUSE 30 11 0.456280468 0.781450522
XPO2_MOUSE 30 11 0.456280468 0.781450522
GCAA_MOUSE 119 46 0.090772873 0.741245459
HNRPK_MOUSE 43 17 0.276253283 0.724757053
H2B2E_MOUSE 58 23 0.182766355 0.722558599
HSP72_MOUSE 40 16 0.261187524 0.716329645
COQ6_MOUSE 30 12 0.352429381 0.716329645
PLEC1_MOUSE 102 41 0.069881298 0.712835354
H12_MOUSE 35 17 0.093012218 0.589918531
H13_MOUSE 35 17 0.093012218 0.589918531
TBA1B_MOUSE 45 23 0.028046307 0.560605809
ARHG1_MOUSE 33 17 0.059577029 0.556208901
DPOLZ_MOUSE 30 17 0.033260467 0.505644455
LMNA_MOUSE 49 28 0.005541794 0.501430752
SPTB2_MOUSE 43 25 0.007837104 0.492834796
MYH11_MOUSE 59 36 0.000721125 0.469593879
ANXA2_MOUSE 39 24 0.005630124 0.465614269
HNRPM_MOUSE 38 24 0.003373485 0.453675442
IGKC_MOUSE 81 53 8.88E-06 0.437907179
TBB5_MOUSE 36 25 0.00103211 0.412605876
SFRS7_MOUSE 60 48 3.40E-07 0.358164823
EPIPL_MOUSE 40 33 1.46E-05 0.347311343
TERA_MOUSE 33 28 4.26E-05 0.337698261
CCD25_MOUSE 228 199 1.12E-27 0.328287757
MYH9_MOUSE 41 36 2.81E-06 0.326327949
K2C8_MOUSE 47 42 2.61E-07 0.320642794
EF1A1_MOUSE 35 33 2.48E-06 0.303897425
ACTG_MOUSE 71 67 1.89E-11 0.303638238
ACTB_MOUSE 70 67 1.57E-11 0.299361643
K1C19_MOUSE 61 64 2.40E-12 0.273100677
K2C72_MOUSE 34 36 1.19E-07 0.270613422
ACTBL_MOUSE 33 36 8.61E-08 0.262654203
K1C42_MOUSE 72 84 1.97E-17 0.245598735
K2C7_MOUSE 36 42 1.47E-09 0.245598735
ACTC_MOUSE 48 57 1.27E-12 0.241289986
K22O_MOUSE 67 81 1.61E-17 0.237007833
K22E_MOUSE 55 67 6.97E-15 0.235212719
K2C6A_MOUSE 82 100 2.26E-21 0.234956124
K1C16_MOUSE 55 68 2.83E-15 0.231753709
K2C74_MOUSE 38 49 7.51E-12 0.22220838
K2C5_MOUSE 95 125 4.75E-28 0.217764212
K1C17_MOUSE 73 98 8.54E-23 0.213436996
K2C4_MOUSE 49 66 5.76E-16 0.212728198
K1C14_MOUSE 85 118 8.29E-28 0.206400067
K1C15_MOUSE 64 93 3.85E-23 0.197183214
K1C10_MOUSE 105 156 4.46E-38 0.192857981
K2C75_MOUSE 74 110 1.69E-27 0.192757795
K2C71_MOUSE 32 48 4.89E-13 0.191021239
K2C79_MOUSE 38 58 1.28E-15 0.187727769
K2C1_MOUSE 53 81 3.62E-21 0.187483808
K1C13_MOUSE 62 98 7.45E-26 0.181275257
K2C73_MOUSE 50 85 7.75E-24 0.168548152
PLAK_MOUSE 31 53 1.51E-15 0.167594106
K2C1B_MOUSE 31 55 1.87E-16 0.161499775
KPYM_MOUSE 82 163 1.59E-48 0.144144861
Appendix 1: Total data set obtained from MaxQuant and Perseus analysis.
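The Odds Ratio column can be reproduced from the two spectral count columns. The following is a minimal sketch, not the procedure used by MaxQuant or Perseus. It assumes that each tabulated value equals the Ezh2/control count ratio divided by a constant of about 3.49, a figure inferred here from the tabulated rows and presumably the ratio of total spectral counts between the two samples. The names `ASSUMED_TOTAL_RATIO` and `normalised_ratio` are hypothetical.

```python
from typing import Optional

# Assumption: this constant is inferred from the rows of Appendix 1
# (e.g. (34 / 9) / 1.082453686); it is not stated anywhere in the source.
ASSUMED_TOTAL_RATIO = 3.490013


def normalised_ratio(ezh2_count: int, control_count: int,
                     total_ratio: float = ASSUMED_TOTAL_RATIO) -> Optional[float]:
    """Return the Ezh2/control count ratio scaled by the assumed total ratio.

    Returns None when the control count is zero, matching the 'NA'
    entries in the table.
    """
    if control_count == 0:
        return None
    return (ezh2_count / control_count) / total_ratio


# A few rows copied from Appendix 1: (entry, Ezh2 spectral, control spectral).
rows = [
    ("SF3A1_MOUSE", 554, 12),
    ("ANXA6_MOUSE", 34, 9),
    ("KPYM_MOUSE", 82, 163),
    ("SUZ12_MOUSE", 76, 0),
]

for name, ezh2, control in rows:
    ratio = normalised_ratio(ezh2, control)
    print(name, "NA" if ratio is None else f"{ratio:.3f}")
```

Under this assumption, values above 1 indicate enrichment in the Ezh2 immunoprecipitation relative to the control, and values below 1 (for example the keratins) indicate enrichment in the control.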
Appendix 2
Appendix 2: DAVID functional annotation analysis of the abbreviated data set (table 3.1)
Appendix 3: Global STRING analysis of the abbreviated data set (table 3.1). Nodes with no interactions have been
removed and remaining nodes have been clustered using MCL=4 means. STRING clustering uses a global interaction
score and clusters nodes with high interaction scores together. The PRC2 complex is circled in red.