Characterising the Interactome of EZH2 in Embryonic Stem Cells

Daire Murphy

09329510

Gerard Cagney

BIOC40090

22/02/2013


Summary

The aim of this study was to characterise and develop the Ezh2 interactome using Ezh2

immunoprecipitation samples from embryonic stem (ES) cells. Ezh2 is the core catalytic

subunit of the PRC2 complex, which is associated with the maintenance of pluripotency and

self-renewal properties in stem cells through epigenetic modifications. Proper functioning of

the PRC2 complex is required for healthy stem cell differentiation and dysfunction of this

complex, whether through mutations in Ezh2 or its other subunits, has been implicated in

tumourigenesis. A mass spectrometric approach was used to identify and quantify the

complex protein mixture obtained from immunoprecipitation experiments in order to establish

an experimental interactome. In-depth analysis of the mass spectrometry results was performed to compile a list of potential high-confidence interactors, and several bioinformatics tools were used to gain insight into the complexity of the Ezh2 interactome and to identify potential new research avenues.

The experimental interactome was seemingly enriched in chromatin remodelling proteins and

transcriptional regulators, which was to be expected due to the nature of the PRC2 complex.

An enrichment in splicing factors was also observed, particularly those belonging to the spliceosome, which prompted further analysis. An association between the spliceosome and the PRC2 complex has yet to be characterised in the literature; an experimental approach for characterising this interaction was therefore developed to provide a platform for further research into the topic. Advances in this field could potentially provide an insight

into how differentiation of stem cells is controlled while maintaining stemness, and how

aberrations in this process can cause the development of tumour cells.


Table of Contents

Summary
1. Introduction
1.1 Stem Cell Characteristics
1.2 Chromatin Structure and the Epigenetic Code
1.3 EZH2 and the Polycomb Group Proteins
1.4 Mass Spectrometry
1.5 Aim
2. Materials and Methods
2.1 Immunoprecipitation
2.2 Gel Electrophoresis
2.3 In-gel Trypsin Digestion
2.4 Peptide Extraction
2.5 Mass Spectrometry
2.6 Protein Identification
3. Results
4. Discussion
4.1 The Current Ezh2 Interactome
4.2 Large Scale Proteomic Analysis of Protein-Protein Interactions
4.3 Global analysis of the Ezh2 interactome data
4.4 Splicing and the PRC2 complex
4.5 Conclusions
5. Acknowledgements
6. Bibliography
7. Appendices


1. Introduction

1.1 Stem Cell Characteristics

Embryonic stem (ES) cells are derived from the inner cell mass of mammalian blastocysts, whereas adult stem cells are found in most tissues throughout the body and are important for tissue homeostasis and damage repair. Stem cells are characterised by their ability to replicate while still maintaining an undifferentiated state. Each ES cell has the potential to differentiate into cells of any of the three germ layers (endoderm, ectoderm, or mesoderm), a property known as pluripotency. Maintaining pluripotency is a key process in stem cells and is regulated both in the traditional genetic sense and in an epigenetic manner. Four key genes have been identified that are necessary for the maintenance of pluripotency in stem cells: Oct4, Sox2, Klf4, and c-Myc [1]. It has also been observed that regular somatic cells can be returned to a pluripotent state through activation of these genes; these cells are known as induced pluripotent stem cells (iPSCs) and have several promising applications within the field of regenerative medicine [1].

Figure 1.1: Pluripotent stem cell life cycle; each stem cell has the potential to differentiate into a multitude of cell

types. (Image taken from: http://www.stemcellresearchfoundation.org/WhatsNew/Pluripotent.htm)


Pluripotency in stem cells is also regulated by epigenetic factors, including proteins belonging to the Polycomb group (PcG), which act as transcriptional repressors. It has been observed that the PcG protein complexes PRC1 and PRC2 are co-localised on many genes essential for differentiation in both murine and human stem cells [2]. Many of the developmental PcG target genes have also been shown to be bound by the key pluripotency transcription factors Oct4, Sox2, and Nanog, which suggests collaboration between the PcG proteins and the essential pluripotency gene products identified by Yamanaka et al. [1][2].

Characterisation of the interactions between the genetic and epigenetic elements of stem cell pluripotency maintenance holds the potential for answering key questions about cell maturation and the downstream effects of dysfunctional regulation of pluripotency, which has been implicated in several disease states, primarily cancer [4][5][6][7].

1.2 Chromatin Structure and the Epigenetic Code

In order to fully understand the mechanisms by which the PcG proteins affect transcription

and their interactions with key pluripotency regulators it is important to first understand how

chromatin is ordered and how the PcG proteins are known to interact and alter this structure.

Chromatin is a combination of proteins and DNA which interact to compact DNA into a stable and manageable structure. The main protein components are histones, which combine to form octameric units around which the DNA wraps to form a nucleosome, the primary subunit of chromatin.

Figure 1.2: The multiple levels of DNA compaction and chromatin structure.


This highly compacted structure allows huge amounts of DNA to be stored in each cell while

also protecting the DNA from damage and strengthening it for mitosis. The level of

compaction is also an important factor in regulating gene expression as compacted DNA

cannot be accessed by transcription machinery.

Figure 1.3: Nucleosome Structure. Each nucleosome consists of two H3 histones, two H4 histones, two H2A and two

H2B histones to form an octameric unit. Each histone has a tail structure which extends outwards from the core

nucleosome unit. (Taken from: ‘Epigenetics’, Allis et al. (2007))

Epigenetic modification of nucleosomes is extremely important for transcriptional regulation.

The amino-terminal regions, or histone tails, are the primary target for modification; they are located outside of the main histone core and are thought to be largely unstructured due to their lack of visibility in crystal structures [7]. Evidence suggests that the tails are largely responsible for the compaction of nucleosomes into chromatin [7]; modification of these tails can therefore alter chromatin structure. The tails and adjacent regions are also known to be target areas for transcription machinery [7]. These tails are rich in lysine and arginine residues, which can be modified in multiple ways including methylation, acetylation, and ubiquitination. Chromatin can exist in two states: heterochromatin, which is condensed and transcriptionally inactive, and euchromatin, which is open and transcriptionally active.


Both states are associated with specific histone-tail modifications, or histone marks; high

levels of methylation are usually associated with transcriptionally silent genes while

acetylation and phosphorylation are associated with active genes. However, there are exceptions: the effect appears to depend on the combination of the type of mark and its position, rather than on the type of mark alone.

Figure 1.4: Different epigenetic marks associated with euchromatin and heterochromatin. Methylation marks are

associated with condensed, inactive chromatin while acetylation marks are associated with open, active chromatin.

(Taken from: ‘Epigenetics’, Allis et al. (2007))

The histone code hypothesis states that distinct histone modifications, on one or more tails, act sequentially or in combination to form a code that is read by other proteins to bring about distinct downstream events [8]. One of the primary downstream events targeted by histone modifications is transcription, which can be either repressed or activated by these marks. For example, the H3K27me3 (histone H3 lysine 27 trimethylation) mark is known to silence transcription, while the H3K4me3 (histone H3 lysine 4 trimethylation) mark is associated with activation of transcription. The Polycomb group and the Trithorax group (TrxG) proteins are well-known transcriptional regulators which are responsible for transcriptional silencing and activation respectively. Both groups of proteins use histone modifications to alter the


structure of the chromatin and its interacting proteins and have well established links with ES

cell differentiation and pluripotency.

Interestingly, both of these marks, H3K27me3 and H3K4me3, can be present on the same gene, which gives rise to the concept of bivalent domains. Bivalent domains contain both activating and repressive marks, which may seem contradictory but are thought to maintain specific genes in a poised state [9]: gene expression is initially repressed through the dominant H3K27me3 mark, while the ability to quickly activate transcription is retained, since removal of the H3K27me3 mark leaves just the activating H3K4me3 mark.
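The logic of a bivalent domain can be sketched in a few lines of Python. This is a toy illustration only: the function name `chromatin_state` and the state labels are assumptions introduced here, and real promoters carry many more marks than the two considered.

```python
def chromatin_state(marks):
    """Classify a promoter by the two histone marks discussed above (toy model)."""
    repressive = "H3K27me3" in marks
    activating = "H3K4me3" in marks
    if repressive and activating:
        return "poised (bivalent)"  # repressed for now, but primed for activation
    if repressive:
        return "repressed"
    if activating:
        return "active"
    return "unmarked"


bivalent = {"H3K27me3", "H3K4me3"}
print(chromatin_state(bivalent))
# Removing the dominant repressive mark leaves only the activating mark.
print(chromatin_state(bivalent - {"H3K27me3"}))
```

The second call shows why a poised gene can be switched on quickly: only one mark has to be removed.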

The histone code also plays an important role in cellular memory outside of canonical DNA heritability, as histone modifications have been seen to be passed on through cell lines [10][11]. These modifications are thought to be important in cell identity, as they potentially provide each cell with an epigenome specific to its function.

1.3 Ezh2 and the Polycomb Group Proteins

The polycomb group proteins (PcG) are a family of proteins associated with silencing of gene

expression through chromatin remodelling. PRC1 and PRC2 are multiprotein complexes

belonging to the PcG family which work together to repress target genes. Repression of target

genes by the PRC complexes is a multistep process starting with the PRC2 complex. The

PRC2 complex trimethylates histone H3 lysine 27 (H3K27) through its catalytic EZH2

subunit, which contains a methyltransferase SET domain.


The trimethylation of H3K27 is associated with transcriptional silencing and is known to recruit the PRC1 complex. PRC1 then ubiquitinylates histone H2A lysine 119 (H2AK119), which is thought to stabilise chromatin condensation and the subsequent transcriptional repression.

Figure 1.5: Interaction of PRC1 and PRC2 to ultimately silence transcription. PRC2 binds the PRC response element

(PRE) and trimethylates H3K27 which acts as a binding platform for PRC1 which causes chromatin condensation

and transcriptional silencing. (Modified from: ‘Epigenetics’, Allis et al. (2007))

The PRC2 complex itself consists of four core components: Enhancer of Zeste homologue 2 (EZH2), Suppressor of Zeste 12 (SUZ12), Embryonic Ectoderm Development (EED), and RbAp46/48. It has been observed through multiple knock-out experiments that removal of any of the EZH2, SUZ12, or EED subunits inactivates the PRC2 complex and disrupts ES cell differentiation and embryonic development [12][13][14]. As previously mentioned, EZH2 is the core catalytic subunit, generating the H3K27me3 mark via its SET methyltransferase domain. The SUZ12 and EED subunits contain a zinc-finger DNA/RNA binding domain and a methyl-lysine binding domain respectively, while RbAp46/48 are histone binding proteins, all of which are thought to aid in the binding of PRC2 to target genes.


Several accessory proteins such as JARID2, Aebp2, and the Polycomb-like (PCL) proteins are known to interact with the complex, and while they do not constitute part of the core complex they have been shown to be necessary for proper functioning. In particular, the PCL protein PHF1 has been shown to interact with EZH2 and is necessary for trimethylation activity. The absence of this protein results in mono/di-methylation of H3K27 and therefore inadequate gene suppression [15]. JARID2 has been shown to be necessary for recruitment of PRC2 to target genes, and once again absence of this protein results in inadequate gene suppression and a loss of the H3K27me3 mark on target genes [16].

Recent studies have shown that non-coding RNAs (ncRNAs) interact with the PRC2 complex through the Suz12 subunit and may play a role in recruitment [17]. Interestingly, many of these ncRNAs are transcribed from PcG-repressed genes [17], which may explain why certain genes contain bivalent domains, while also suggesting how PRC2 identifies which genes to target.

Figure 1.6: The PRC2 complex and its associated chromatin interactions. EZH2, EED, SUZ12 and RbAp46/48 are

essential for PRC2 complex formation, while additional protein units bind to enhance its functioning. Putative

interactions with either DNA or histones that could explain PRC2 recruitment are highlighted [18]. Recent

developments have associated the binding of ncRNA to the Suz12 subunit, as opposed to the Ezh2 subunit as shown [17].


The vast array of known interactions between the PRC2 complex and accessory molecules

provides an insight into the complexity of the Ezh2 (PRC2) interactome, while also

establishing some of the key questions that need answering. Hence, in this project, I aim to

characterise the interactions of Ezh2 in ES cells.

1.4 Mass Spectrometry

Mass spectrometry (MS) has become an increasingly important tool in proteomics research in recent times. Not only can it be used to deduce the sequence of a protein or peptide, but it can also be used to detect the protein composition of a mixture using a technique known as peptide mass fingerprinting (PMF).

A typical mass spectrometry experiment involves 3 main steps:

1. Ionisation: the sample is converted into charged, gas-phase ions. The two primary techniques used to ionise the sample are electrospray ionisation (ESI) and matrix-assisted laser desorption ionisation (MALDI).

2. Separation: the ions are then sorted based on their mass-to-charge ratio (m/z). There are multiple techniques for this separation stage including time-of-flight (TOF), quadrupole mass filter, or ion traps.

3. Detection: the mass-to-charge ratio of the ions is detected and quantified to give a final output in the form of a spectrum.

A variety of combinations of ionisation and separation methods can be used, depending on what type of sample is being analysed. A common method for protein detection is MALDI-TOF due to its high throughput properties.
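As a brief worked example of the mass-to-charge ratio used in the separation step, the sketch below computes m/z for a peptide carrying one, two, or three protons. The peptide mass of 1500 Da is a hypothetical value chosen for illustration, and the function name `mz` is introduced here rather than taken from any software used in this project.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass, charge):
    """m/z of a positive ion formed by adding `charge` protons to a neutral peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


peptide_mass = 1500.0  # Da, hypothetical neutral peptide mass
for z in (1, 2, 3):
    print(f"z = {z}+  m/z = {mz(peptide_mass, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is why the charge must be assigned before a mass can be inferred.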


Peptide mass fingerprinting (PMF) uses computer algorithms to analyse the MS spectrum and

match the detected fragment ions to a database of known or predicted peptide fragments to

predict the protein they belong to. Peptide mass fingerprinting has some limitations when

used with standard MS protocols, primarily with protein mixtures. PMF algorithms generally

assume all peptide fragments belong to the same protein, and expanding algorithms to match

fragments with more than one protein is quite difficult.
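The matching idea behind PMF can be sketched as follows: a candidate protein sequence is digested in silico with trypsin, the mass of each theoretical peptide is calculated, and these masses are compared with the observed masses. This is a minimal illustration, not the algorithm used by any particular search engine; the sequence, the observed masses, and the 0.01 Da tolerance are hypothetical, while the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # Da, added once per peptide for the free termini


def trypsin_digest(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_peaks(observed_masses, sequence, tolerance=0.01):
    """Return theoretical peptides whose mass is within `tolerance` Da of an observed mass."""
    hits = []
    for peptide in trypsin_digest(sequence):
        mass = peptide_mass(peptide)
        if any(abs(mass - obs) <= tolerance for obs in observed_masses):
            hits.append(peptide)
    return hits


candidate = "MKWVTFRPAK"      # hypothetical protein sequence
observed = [277.146, 500.0]   # hypothetical observed neutral masses (Da)
print(trypsin_digest(candidate))
print(match_peaks(observed, candidate))
```

A real search scores every protein in a database in this way, which is where the single-protein assumption described above becomes a limitation for mixtures.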

Typically this can be overcome by initial separation of protein samples using gel

electrophoresis combined with a method known as tandem mass spectrometry (MS/MS)

which involves two separation methods with an additional fragmentation cell between them.

Figure 1.7: Tandem Mass spectrometry diagram. A mixture of peptide fragments is initially separated by the first

separation unit and filtered into a collision cell. The collision cell uses inert gas molecules to fragment the peptides

further through collisions; the new fragments are then filtered into the second separation chamber and sorted by their

m/z ratio and passed onto the detector.


The use of tandem MS and PMF is extremely useful in the field of proteomics, especially in

relation to this project, as it allows quantification and analysis of large protein mixtures to

establish a database of potentially interacting proteins.

1.5 Aim

The aim of this study is to expand our knowledge of the current Ezh2 (PRC2) interactome,

with the hopes of identifying promising new areas of potential research that could deepen our

knowledge of the polycomb proteins and their effects within the cellular environment.


2. Materials and Methods

2.1 Immunoprecipitation

Immunoprecipitations were performed on nuclear protein lysates prepared in low-salt buffer containing protease inhibitors (150 mM NaCl, 50 mM Tris-HCl, pH 8.0, 1 mM EDTA, 1% (v/v) NP-40, 1 μg ml−1 aprotinin, 10 μg ml−1 leupeptin and 1 mM PMSF) [14]. Immunoprecipitations of Flag-tagged proteins were performed using M2 anti-Flag agarose (Sigma) overnight at 4 °C. Elution of Flag-tagged proteins was performed at 4 °C using 250 μg ml−1 of 3× Flag peptide (Sigma) in 0.05% (v/v) NP-40 with horizontal shaking. Eluted protein fractions were separated by SDS-PAGE and analysed by western blotting or liquid chromatography MS [14]. Immunoprecipitation samples were provided by the Bracken lab, located in the Smurfit Institute of Genetics, Trinity College Dublin.

2.2 Gel electrophoresis

Immunoprecipitation samples were prepared for SDS-PAGE; each sample was made up with

equal parts protein solution and 2x sample buffer to give a final 1x sample buffer

concentration. Three protein samples of varying protein content (10 µg, 20 µg, and 30 µg) were prepared according to table 2.1.

Once prepared, the samples were boiled for 10 minutes at 96 °C to fully denature the proteins.

Stock protein concentration: 5.25 µg/µl

Desired protein content (µg)   Stock volume (µl)   2x sample buffer volume (µl)   Final volume (µl)
10                             1.9                 1.9                            3.8
20                             3.8                 3.8                            7.6
30                             5.7                 5.7                            11.4

Table 2.1: SDS-PAGE sample preparation
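The volumes in table 2.1 follow directly from the stock concentration and the 1:1 mix with 2x sample buffer. A minimal sketch of the calculation is given below; the function name `sample_volumes` is introduced here for illustration.

```python
STOCK_CONC = 5.25  # ug/ul, stock protein concentration stated above


def sample_volumes(protein_ug, stock_conc=STOCK_CONC):
    """Return (stock ul, 2x buffer ul, final ul) for a 1:1 mix with 2x sample buffer."""
    stock_ul = protein_ug / stock_conc
    buffer_ul = stock_ul  # equal parts gives a final 1x buffer concentration
    return stock_ul, buffer_ul, stock_ul + buffer_ul


for ug in (10, 20, 30):
    stock, buf, final = sample_volumes(ug)
    print(f"{ug} ug: stock {stock:.1f} ul + buffer {buf:.1f} ul = {final:.1f} ul")
```

The final volumes obtained in this way (3.8, 7.6, and 11.4 µl) match the volumes loaded onto the gel in table 2.4.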


SDS-PAGE was set up as follows. The SDS-PAGE-Glycine running buffer was first prepared as a 10X stock and subsequently diluted 10-fold using ultrapure water to achieve the desired 1X concentration (table 2.1).

10x Running Buffer (500 ml)
Component      Weight
Tris (Sigma)   15 g
Glycine        72 g
SDS            5 g
Bring up to 500 ml with ultrapure water.

1x Running Buffer (500 ml)
Component                  Volume
10x Running Buffer stock   50 ml
Bring up to 500 ml with ultrapure water.

Table 2.1: SDS-PAGE-Tris running buffer composition

A 7.5% resolving gel (table 2.3) was poured and left to set for 20-30 minutes. Once set, a 4.5% stacking gel (table 2.3) was poured on top, a comb was placed in the gel immediately, and the gel was left to set for 15-20 minutes. The SDS-PAGE apparatus was cleaned and assembled using the prepared polyacrylamide gel, and the 1X running buffer was added.

4x Resolving Buffer (50 ml)
Component   Final conc.   Quantity         pH
Tris        1.5 M         9.0855 g/50 ml   8.8
10% SDS     0.4%          2 ml
Bring up to 50 ml with ultrapure water.

4x Stacking Buffer (50 ml)
Component   Final conc.   Quantity         pH
Tris        0.5 M         3.03 g/50 ml     6.8
10% SDS     0.4%          2 ml
Bring up to 50 ml with ultrapure water.

Table 2.2: Stock buffer recipes for PAGE gels

Resolving Gel (7.5%)              Stacking Gel (4.5%)
Component             Volume      Component            Volume
4x Resolving buffer   3.5 ml      4x Stacking buffer   3.75 ml
40% Acrylamide        2.6 ml      40% Acrylamide       3 ml
Water                 7.9 ml      Water                8.25 ml
10% APS               82 µl       10% APS              110 µl
TEMED                 7.8 µl      TEMED                15 µl

Table 2.3: Polyacrylamide gel recipes


The boiled immunoprecipitation samples were spun down using a microcentrifuge to remove

evaporation droplets and bubbles from the sample which could potentially impede loading of

the samples. The samples were loaded onto the gel according to table 2.4; a small amount of sample buffer was loaded between protein samples to prevent cross-contamination and to aid even running of the samples during electrophoresis.

Lane      1      2      3      4       5      6       7      8        9      10
Content   SB     Mw     SB     10 µg   SB     20 µg   SB     30 µg    SB     SB
Volume    5 µl   5 µl   5 µl   3.8 µl  5 µl   7.6 µl  5 µl   11.4 µl  5 µl   5 µl

Table 2.4: SDS-PAGE gel lane contents. SB: Sample buffer (Invitrogen), Mw: Molecular weight markers (Thermo Scientific PageRuler Prestained Protein Ladder)

The gel was run at 100 V for 1 hour, or until the dye front was approximately ¾ of the way down the gel. Once finished, the gel was stained with Coomassie Blue (Fisher Scientific) for 20 minutes; excess dye was then washed off with dH2O and the gel was left to de-stain in dH2O overnight on a shaker.

2.3 In-gel trypsin digestion

Buffer concentration   200 mM   100 mM   50 mM   20 mM
H2O                    50 ml    50 ml    50 ml   50 ml
NH4HCO3                0.8 g    0.4 g    0.2 g   0.08 g

Table 2.5: NH4HCO3 Buffer Solution Concentrations

The workspace, tube racks, and plastic sheeting were cleaned thoroughly with freshly prepared 70% ethanol; this was done regularly to avoid contamination. Lane 6 (20 µg) was excised from the gel with a sterile scalpel and cut into 8 sections. Eight 0.5 ml Eppendorf tubes were washed with acetonitrile (ACN), one for each section of the lane. All buffers were made fresh according to table 2.5.
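The masses in table 2.5 can be checked with a short calculation of mass = concentration × volume × molar mass. The sketch below assumes a molar mass of 79.06 g/mol for ammonium bicarbonate; the function name `grams_needed` is introduced here for illustration.

```python
NH4HCO3_MW = 79.06  # g/mol, molar mass of ammonium bicarbonate


def grams_needed(conc_mM, volume_ml, molar_mass=NH4HCO3_MW):
    """Mass of solute (g) for a `conc_mM` millimolar solution of `volume_ml` millilitres."""
    return (conc_mM / 1000) * (volume_ml / 1000) * molar_mass


for conc in (200, 100, 50, 20):
    print(f"{conc} mM in 50 ml: {grams_needed(conc, 50):.2f} g")
```

The calculated values agree with the weights in table 2.5 once rounded to the precision of the balance.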


70 µl of 200 mM NH4HCO3 (ammonium bicarbonate) was added to each of the eight Eppendorf tubes (labelled 1-8, with 1 being the topmost section of the lane). Each section was cut into ~1 mm² pieces and placed into the prepared Eppendorf tubes, as shown in figure 2.1.

Figure 2.1: Lane cutting workflow. The desired lane was excised from the gel, and then cut into eight sections. Each section was cut into ~1 mm² pieces and placed in a clean 0.5 ml Eppendorf tube with 70 µl of 200 mM NH4HCO3.

The next step in the digestion process then consisted of several rounds of shrinking and

rehydrating of the gel to remove the protein from the gel pieces for trypsin digestion. Each

tube was treated in the same manner according to figures 2.2 to 2.9.

Figure 2.2: Washing of gel pieces to remove dye (tube images taken from http://www.clker.com; public clipart).


Figure 2.3: Shrinking (dehydration) of gel pieces. Shrinking the gel with NH4HCO3 disrupts the interactions between the protein and gel; repeated rounds of dehydration and rehydration at various concentrations of ammonium bicarbonate release the proteins from the gel.

Figure 2.4: Rehydration and shrinking of the gel pieces.

Figure 2.5: Rehydration and shrinking of gel spots.


Figure 2.6: Reduction and protection of cysteine residues. DTT is a strong reducing agent which will break the

disulphide bonds in the proteins contained in the gel. Iodoacetamide binds the free thiol groups exposed by the DTT

reduction to prevent reformation of the disulphide bonds.

Figure 2.7: Washing and shrinking of gel pieces.

Figure 2.8: Final shrinking of gel pieces


Figure 2.9: Trypsin digestion of proteins for peptide extraction. The trypsin was prepared using a Trypsin Stock kit

(Sigma); the dried trypsin (0.4µg) was solubilised using 5µl of the trypsin solubilisation reagent and vortexed to

ensure the powder was fully dissolved. 45µl of the trypsin reaction buffer was then added, making the final trypsin

concentration 20µg/ml.

2.4 Peptide Extraction

Component      Buffer 1       Buffer 2
Acetonitrile   3.5 ml (70%)   0.5 ml (10%)
Formic acid    0.4 ml (4%)    0.05 ml (0.1%)
dH2O           1.1 ml         4.45 ml

Table 2.6: Peptide extraction buffers

Samples were spun down using a desktop microcentrifuge for 10 seconds and the supernatant

was removed and added into new 0.5ml Eppendorf tubes. 30µl of buffer 1 was added to the

remaining gel pieces and left for 10 minutes to retrieve any remaining peptides from the gel

pieces. The samples were spun down and the supernatant was removed and added to the new

Eppendorf tubes. The peptide samples were then dried in a vacuum centrifuge at 60 °C for 1.5

hours. The dried peptides were then resuspended in 10µl of buffer 2 and transferred into

labelled MS tubes.


2.5 Mass spectrometry

An EASYnLC (Proxeon Biosystems), with an LTQ Orbitrap Classic mass spectrometer

(Thermo Scientific) and nano-electrospray ionisation source (Proxeon Biosystems) was used

to perform liquid chromatography-tandem mass spectrometry (LC-MS/MS). Mass

spectrometry analysis was carried out by Kieran Wynne in the Conway Institute Core Facility.

Peptides were separated on a 15cm reversed phase analytical column (75µm internal

diameter) in-house packed with 3 µm ReproSil-Pur C18AQ beads, using a 1hr gradient from

2-25% buffer B (99.9% MeCN/0.1% formic acid) at a flow rate of 0.4 µl/min.
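To illustrate the gradient described above, the sketch below calculates the percentage of buffer B at a given time and the total volume delivered at 0.4 µl/min. It assumes the gradient is linear over the full 60 minutes, which is not stated explicitly in the method; the function name `percent_b` is introduced here.

```python
FLOW_UL_PER_MIN = 0.4  # ul/min, flow rate stated above


def percent_b(t_min, start=2.0, end=25.0, duration_min=60.0):
    """%B at time t for a linear gradient (linearity is an assumption)."""
    t = min(max(t_min, 0.0), duration_min)  # clamp to the gradient window
    return start + (end - start) * t / duration_min


for t in (0, 30, 60):
    print(f"t = {t:2d} min: {percent_b(t):.1f}% B")
print(f"Total volume over the gradient: {FLOW_UL_PER_MIN * 60:.1f} ul")
```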

Data were collected continuously in a data-dependent manner, with the Orbitrap collecting a survey scan at 60,000 resolution with an automatic gain control (AGC) target of 1 × 10⁶. Collision-induced dissociation (CID) MS/MS scans followed, using the 10 most abundant ions from the survey scan, with an AGC target of 5,000; signal threshold of 1,000; 2.0 Da isolation width; and MS activation time at 35% normalised collision energy. Charge state screening was used to reject unassigned or 1+ charge states. Dynamic exclusion was enabled to ignore masses for 30 s that had previously been selected for fragmentation.
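The data-dependent selection logic described above can be sketched as a simple function. This is an illustrative model and not the instrument's acquisition software: the function name `select_precursors`, the survey scan values, and the use of m/z rounded to two decimal places as the exclusion key are all assumptions made here (a real instrument applies a mass tolerance window).

```python
def select_precursors(survey, now_s, excluded, top_n=10, threshold=1000, exclusion_s=30.0):
    """Pick precursor ions for MS/MS from one survey scan.

    survey   : list of (mz, intensity, charge) tuples; charge None means unassigned
    excluded : dict mapping rounded m/z to the time (s) at which its exclusion expires
    """
    candidates = [
        (mz, intensity)
        for mz, intensity, charge in survey
        if charge is not None and charge >= 2           # reject unassigned and 1+ ions
        and intensity >= threshold                      # signal threshold
        and excluded.get(round(mz, 2), 0.0) <= now_s    # not under dynamic exclusion
    ]
    candidates.sort(key=lambda c: c[1], reverse=True)   # most abundant first
    chosen = [mz for mz, _ in candidates[:top_n]]
    for mz in chosen:
        excluded[round(mz, 2)] = now_s + exclusion_s
    return chosen


# Hypothetical survey scan: (m/z, intensity, charge)
scan = [(500.25, 9000, 2), (612.30, 5000, 1), (745.40, 800, 3),
        (830.10, 4000, None), (421.76, 3000, 3)]
exclusion_list = {}
print(select_precursors(scan, 0.0, exclusion_list))   # first pass
print(select_precursors(scan, 10.0, exclusion_list))  # within the 30 s exclusion window
print(select_precursors(scan, 31.0, exclusion_list))  # after exclusion has expired
```

Dynamic exclusion stops the instrument from repeatedly fragmenting the same abundant ions, giving less abundant peptides a chance to be selected.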

2.6 Protein identification and quantification

The raw MS file was processed using MaxQuant on default settings (figure 2.10), with the

exception of carbamidomethyl as a fixed modification.

Figure 2.10: Default MaxQuant settings


3. Results

3.1 A proteomics approach to characterising the Ezh2 interactome

Figure 3.1 shows a schematic workflow of the approach taken in this project to provide a

clear and concise overview of the methods used and the reasoning behind each step. The

diagram shows how the initial Ezh2 immunoprecipitation was processed, through a

combination of lab based methods and bioinformatics, to provide the final output seen in table

3.1. As with all scientific experiments each step in the process had a specific aim:

1. Immunoprecipitation: Purification of the protein of interest (Ezh2), and interacting

proteins, from cell lysate.

2. SDS-PAGE: Separation of proteins by molecular weight.

3. In-gel Trypsin digestion: Removal of proteins from the gel and subsequent digestion

of proteins into small peptides for MS analysis.

4. Mass spectrometry: Detection and quantification of peptide ions.

5. Peptide mass fingerprinting (MaxQuant analysis): Matching of peptide ions to known

and predicted MS/MS ion databases to generate a protein list.

6. Statistical Analysis (Perseus): Assignment of p-values to each protein hit to allow

separation of false positives from true interactors.

7. Bioinformatic analysis: Interpretation of data using various databases to generate

results and evidence for further research.


Figure 3.1: Workflow for analysis of the Ezh2 interactome:

1. A co-immunoprecipitation of Ezh2 from ES cell lysate was performed.

2. The protein samples were separated on an SDS-PAGE gel.

3. An in-gel trypsin digestion and peptide extraction was performed.

4. The peptide samples were analysed using LC-MS/MS.

The MS results were analysed using MaxQuant and peptide mass fingerprinting to obtain a list of proteins which

were subsequently used to form an interactome for Ezh2 using bioinformatics database analysis.

(SDS-PAGE apparatus image edited from; http://commons.wikimedia.org/wiki/File:SDS-PAGE_Electrophoresis.png)

(ES cell images taken from; http://en.wikipedia.org/wiki/File:Humanstemcell.JPG)



3.2 Statistical Analysis and Data Interpretation

The aim of the statistical analysis was to separate true Ezh2 interactors from false positives using a Student's t-test.

The raw mass spectral files were initially analysed using MaxQuant, a quantitative proteomics software package designed for analysing large mass-spectrometric data sets [19][20]. By using Perseus, a statistical software package designed to perform downstream bioinformatics and statistics on the MaxQuant output tables [20], it was possible to perform a Student's t-test to generate p-values for each protein hit. Taking a confidence level of 95% to be adequate, proteins with a p-value of less than 0.05 were taken to be true interactors. Table 3.1 shows all protein hits with P<0.05, to allow for easier interpretation of the total data set (found in appendix 1).
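For readers unfamiliar with the test, the sketch below shows how a two-sample Student's t statistic is calculated and compared against the critical value for a 95% confidence level. It is not the Perseus implementation: the replicate spectral counts are hypothetical values for illustration only, and the function name `student_t` is introduced here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical triplicate spectral counts, for illustration only.
ezh2_ip = [30, 34, 32]
control_ip = [0, 2, 1]
T_CRIT_DF4 = 2.776  # two-sided critical value at alpha = 0.05 for 4 degrees of freedom

t, df = student_t(ezh2_ip, control_ip)
print(f"t = {t:.2f}, df = {df}, significant at 0.05: {abs(t) > T_CRIT_DF4}")
```

A protein whose t statistic exceeds the critical value has P<0.05 and would be retained as a candidate interactor under the criterion used here.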


Gene          Ezh2 spectral  Control spectral  P-value      Description
TITIN_MOUSE   382            0                 1.22E-41     Plays a role in chromatin condensation and segregation
GUF1_MOUSE    231            0                 1.20E-25     Mitochondrial translation factor
PRP8_MOUSE    183            0                 2.18E-20     Spliceosome component
SF3B1_MOUSE   150            0                 9.82E-17     Splicing factor
DDX21_MOUSE   127            0                 2.39E-14     RNA helicase
TOP2A_MOUSE   127            0                 2.39E-14     DNA topoisomerase
SPT5H_MOUSE   126            0                 4.16E-14     Transcription factor
SF3A2_MOUSE   106            0                 3.88E-12     Splicing factor
SSRP1_MOUSE   101            0                 1.56E-11     Component of the FACT complex
U5S1_MOUSE    98             0                 4.20E-11     Component of the U5 snRNP complex – pre-mRNA splicing
SMCA4_MOUSE   84             0                 1.01E-09     (BRG1) Transcriptional co-activator
H2B1B_MOUSE   81             0                 2.62E-09     Histone
H2B1C_MOUSE   81             0                 2.62E-09     Histone
H2B1K_MOUSE   81             0                 2.62E-09     Histone
DHX15_MOUSE   78             0                 4.10E-09     Pre-mRNA splicing factor
K2C6B_MOUSE   78             0                 4.10E-09     Keratin
H2A1K_MOUSE   75             0                 1.02E-08     Histone
SUZ12_MOUSE   76             0                 1.12E-08     Component of PRC2 complex
SMCA5_MOUSE   73             0                 1.62E-08     Helicase – nucleosome remodelling activity
SPT6H_MOUSE   73             0                 1.62E-08     Elongation factor
EZH2_MOUSE    74             0                 1.75E-08     Catalytic component of the PRC2 complex
ARI1A_MOUSE   72             0                 2.72E-08     Chromatin Remodelling (nBAF)
SMG1_MOUSE    67             0                 6.59E-08     Protein Kinase
H2AV_MOUSE    60             0                 4.17E-07     Histone
LRP2_MOUSE    60             0                 4.17E-07     LDL receptor related protein
ASH1L_MOUSE   61             0                 4.44E-07     H3K36 methyltransferase
ASPM_MOUSE    61             0                 4.44E-07     Mitotic spindle regulation
VP13C_MOUSE   61             0                 4.44E-07     Vacuolar protein sorting-associated protein 13C
EED_MOUSE     59             0                 6.94E-07     Component of the PRC2 complex
CDC5L_MOUSE   56             0                 1.06E-06     DNA binding protein involved in cell cycle control
H2B3B_MOUSE   56             0                 1.06E-06     Histone
EIF3A_MOUSE   57             0                 1.09E-06     Translation initiation factor
H2AY_MOUSE    57             0                 1.09E-06     Histone
SMRC1_MOUSE   57             0                 1.09E-06     Chromatin remodelling – (SWI/SNF, WINAC, nBAF)
H2B1A_MOUSE   55             0                 1.71E-06     Histone
SACS_MOUSE    55             0                 1.71E-06     HSP70 chaperone
CHD7_MOUSE    53             0                 2.70E-06     Chromodomain-helicase-DNA-binding protein 7

RPB1_MOUSE 53 0 2.70E-06 DNA directed RNA polymerase II

subunit RPB1

CD158_MOUSE 52 0 4.68E-06 Coiled coil domain containing

protein 158

DDX3X_MOUSE 52 0 4.68E-06 RNA helicase

ACINU_MOUSE 49 0 6.77E-06 Apototic chromatin condensation

EH1L1_MOUSE 50 0 7.28E-06

RB6I2_MOUSE 50 0 7.28E-06 Regulatory subunit of IKK complex

SPTA2_MOUSE 50 0 7.28E-06 Cytoskeleton protein

CROCC_MOUSE 47 0 1.08E-05 Cytoskeleton protein

DYHC1_MOUSE 47 0 1.08E-05 Motor protein

RS9_MOUSE 47 0 1.08E-05 Ribosomal protein

ACTA_MOUSE 48 0 1.14E-05 Actin

FXR1_MOUSE 45 0 1.72E-05 Fragile X mental retardation

syndrome-related protein 1

MBB1A_MOUSE 45 0 1.72E-05 Transcriptional activator/repressor –

interacts with DNA binding proteins

TBA1A_MOUSE 45 0 1.72E-05 Tubulin

RIF1_MOUSE 43 0 2.75E-05 Required for checkpoint mediated

arrest of cell cycle progression

SMC1A_MOUSE 43 0 2.75E-05 Chromosome cohesion during cell

cycle and DNA repair

DYHC2_MOUSE 44 0 2.80E-05 Dynein

EWS_MOUSE 44 0 2.80E-05 RNA binding protein – may be

repressive

RS2_MOUSE 44 0 2.80E-05 Ribosomal protein

SMCA2_MOUSE 44 0 2.80E-05 Transcriptional activator

(nBAF/WINAC)

DDX3Y_MOUSE 42 0 4.41E-05 RNA helicase

MYH10_MOUSE 42 0 4.41E-05 Myosin

PB1_MOUSE 42 0 4.41E-05 Chromatin remodellor – regulator of

cell proliferation

TCRG1_MOUSE 42 0 4.41E-05 RNA polII inhibitor

CHD4_MOUSE 40 0 6.98E-05 Histone deacetylase – NuRD

complex

MATR3_MOUSE 41 0 7.72E-05 Matrin – transcription factor

SRRM2_MOUSE 41 0 7.72E-05 Pre-mRNA splicing

BAZ1B_MOUSE 38 0 0.000110695 Tyrosine protein kinase – chromatin

remodelling

PRP6_MOUSE 38 0 0.000110695 Splicing factor

PSIP1_MOUSE 39 0 0.000120205 Cell differentiation, stress response

RPB2_MOUSE 39 0 0.000120205 DNA-directed RNA polymerase II subunit RPB2

AQR_MOUSE 36 0 0.00017612 Splicing factor

EGF_MOUSE 36 0 0.00017612 Epidermal growth factor

IF4A3_MOUSE 36 0 0.00017612 RNA helicase

RIMS2_MOUSE 36 0 0.00017612 Rab effector involved in exocytosis

SMCA1_MOUSE 37 0 0.000187766 Component of the NURF complex

FMN2_MOUSE 34 0 0.000281027 Central nervous system protein

RBM25_MOUSE 34 0 0.000281027 RNA binding protein – splicing

factor

BAHC1_MOUSE 35 0 0.000294268 Coiled coil domain protein

CJ018_MOUSE 35 0 0.000294268

H2A2B_MOUSE 35 0 0.000294268 Histone

TIF1B_MOUSE 35 0 0.000294268 Transcription repression

DDX46_MOUSE 32 0 0.000449671 Splicing factor

ERC2_MOUSE 32 0 0.000449671

K1C18_MOUSE 32 0 0.000449671 Keratin

SFRS1_MOUSE 32 0 0.000449671 Splicing factor

SMU1_MOUSE 32 0 0.000449671

RBBP7_MOUSE 33 0 0.000462695 Component of the PRC2 complex

TCOF_MOUSE 33 0 0.000462695 May play a role in embryonic

development

ERC6L_MOUSE 30 0 0.000721432 DNA helicase

ILF3_MOUSE 30 0 0.000721432 Interleukin enhancer factor

MGAP_MOUSE 30 0 0.000721432

VWF_MOUSE 30 0 0.000721432 Von Willebrand factor

CP250_MOUSE 31 0 0.000729902 Centrosome cohesion

DKC1_MOUSE 31 0 0.000729902

DNMT1_MOUSE 31 0 0.000729902 DNA methyltransferase

KDM5B_MOUSE 31 0 0.000729902 H3K4 demethylase

KTN1_MOUSE 31 0 0.000729902 Kinesin

MAN1_MOUSE 31 0 0.000729902 TGF-b antagonist

MDC1_MOUSE 31 0 0.000729902 DNA repair

PGBM_MOUSE 31 0 0.000729902



PRP19_MOUSE 38 2 0.006524444 DNA repair

NPM_MOUSE 65 4 0.000386053 Histone assembly, cell proliferation,

histone chaperone

GPC4_MOUSE 30 2 0.031020003 Proteoglycan

RBBP4_MOUSE 43 3 0.0071263 Component of PRC2 complex

H31_MOUSE 69 5 0.000627128 Histone

SP16H_MOUSE 135 11 3.03E-06 FACT complex protein

RS14_MOUSE 54 5 0.00752452 Ribosomal protein

RS11_MOUSE 49 5 0.020706808 Ribosomal protein


FBRL_MOUSE 68 8 0.012257672 Pre-rRNA processing

FLNB_MOUSE 73 9 0.011268778 Cytoskeleton protein

NUCL_MOUSE 104 13 0.002483551 Chromatin decondensation

RS3A_MOUSE 72 9 0.015189771 Ribosomal protein

RS4X_MOUSE 70 9 0.020304002 Ribosomal protein

Table 3.1: Mass spectrometry data set showing only hits with a significance of P<0.05. Count 1 refers to the experimental (Ezh2 IP) sample while count 2 refers to the control (IgG) sample.
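The filtering behind Table 3.1 can be illustrated with a short sketch. This is a minimal Python example, assuming each hit is stored as a record of protein name, Ezh2 IP count, control count and p-value. The `Hit` type, the `significant_hits` helper and the non-significant example row are hypothetical and were not part of the original analysis pipeline.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    protein: str
    ezh2_count: int      # count 1: Ezh2 immunoprecipitation sample
    control_count: int   # count 2: IgG control sample
    p_value: float


def significant_hits(hits, alpha=0.05):
    """Keep hits with p < alpha and order them from most to least significant."""
    kept = [h for h in hits if h.p_value < alpha]
    return sorted(kept, key=lambda h: h.p_value)


hits = [
    Hit("RS4X_MOUSE", 70, 9, 0.020304002),   # row from Table 3.1
    Hit("SP16H_MOUSE", 135, 11, 3.03e-06),   # row from Table 3.1
    Hit("RBBP4_MOUSE", 43, 3, 0.0071263),    # row from Table 3.1
    Hit("EXAMPLE_MOUSE", 12, 10, 0.61),      # hypothetical non-significant row
]

for h in significant_hits(hits):
    print(f"{h.protein}\t{h.ezh2_count}\t{h.control_count}\t{h.p_value:.2e}")
```

The threshold is exposed as the `alpha` parameter so that a stricter cut-off could be tried without changing the rest of the sketch.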

3.3 Bioinformatics Analysis

Analysis of the refined data set using UniProt (Universal Protein Resource), a comprehensive resource for protein sequence and annotation data [21], suggested that the sample was enriched in proteins associated with several core cellular activities including splicing, transcription and chromatin remodelling.

Figure 3.2: Relative abundance of proteins associated with the PRC2 complex (dark blue), splicing (orange),

transcription (red) and chromatin remodelling (purple), according to UniProt database mining.

Processing of the refined data set using DAVID (the Database for Annotation, Visualization and Integrated Discovery), a bioinformatics resource which provides a set of functional annotation tools for analysing large data sets [22][23], confirmed that the abbreviated data set was enriched in the biological functions initially indicated by UniProt database mining.

[Figure 3.2 pie chart legend: PRC2; Splicing Factors; Transcription Machinery; Chromatin Remodellers; Other]

Figure 3.3: Functional Annotation analysis of the abbreviated data set (table 3.1), showing key enriched functions

(adapted from DAVID output table, appendix 2)

Isolation of smaller subsets of proteins was then carried out to allow for an in-depth analysis of certain groups of interest, primarily splicing, transcription, and chromatin remodelling, as the data set contained a high proportion of proteins associated with each of these functions (figure 3.2, figure 3.3). Each group was isolated and protein counts were graphed to give some insight into the abundance of certain proteins that could be of interest.

STRING analysis was also performed on each subset to provide a visual representation of the known interactome. STRING is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations, and are derived from four sources: genomic context, high-throughput experiments, co-expression, and previous knowledge (database mining) [24].

[Figure 3.3 bar chart: x-axis, proteins in functional annotation category (%), 0 to 100; categories: Methyltransferase, Methylation, Chromatin Regulator, RNA-binding, DNA-binding, Transcription Regulation, Spliceosome, Acetylation, Phosphoprotein, Nucleus]

Figure 3.4: Expression levels of PRC2 proteins in the data set.

Figure 3.5: STRING network diagram of the EZH2 local interactome. The coloured lines refer to the type of evidence

in place to support the specific interaction.

[Figure 3.4 bar chart: y-axis, protein count (0 to 80); series: Ezh2 IP, Control]

Gene  Ezh2 Spectral  Control Spectral  P-value  Description

SUZ12_MOUSE 76 0 1.12E-08 Contains a zinc finger domain for DNA/RNA binding.

EZH2_MOUSE 74 0 1.75E-08 The catalytic subunit of the PRC2 complex. SET domain has methyltransferase activity.

EED_MOUSE 59 0 6.94E-07 Post-translational modification binding

RBBP7_MOUSE 33 0 0.000462695 Histone binding

RBBP4_MOUSE 43 3 0.0071263 Histone binding

Table 3.2: PRC2 proteins isolated from the total data set.

Table 3.2 shows the PRC2 components detected in the data set, while figure 3.4 indicates the relative abundance of each based on the protein counts generated by MaxQuant. The STRING network in figure 3.5 was generated by the programme itself using only Ezh2 as an input; this is the known and predicted interactome of Ezh2 according to the STRING database. However, not all known protein interactions are present in the network, which suggests that STRING has some limitations and that networks generated by the programme should be viewed with a critical eye.

Table 3.3 contains the protein hits with known, or predicted, splicing activity which were detected in the Ezh2 pull-down sample. The protein counts have been graphed to give a visual representation of the data (figure 3.6). Two proteins, SF3A1 and DDX5, have particularly high counts. A UniProt database search for each of these proteins reveals that SF3A1 is a subunit of a splicing factor (SF3A) which is responsible for assembly of the ‘A’ complex of the spliceosome, while DDX5 is a regulator of pre-mRNA splicing in the cell. Both proteins seem to have important activity with regard to spliceosome structure and function.

Functional annotation analysis of the subset (figure 3.7) shows enrichment in helicase activity, RNA processing, and spliceosome proteins, in keeping with the original assignment of these proteins to a splicing sub-group.

The STRING network for this sub-group (figure 3.8) shows that the majority of the proteins are located in an interaction-rich core which represents the spliceosome and its associated accessory proteins. Interaction between the Rbbp7 subunit of the PRC2 complex and Prpf19, an integral part of the spliceosome, indicates potential cross-talk between the two. It should be noted, however, that Rbbp7 is known to belong to several other protein complexes, including the chromatin remodelling complex NuRD [25], and therefore interactions with it do not necessarily imply interactions with the PRC2 complex. Nevertheless, given the abundance of splicing proteins in the sample, the Rbbp7 subunit of the PRC2 complex may act as a bridge between Ezh2 and the spliceosome, conferring a functional relationship between the two complexes.
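The idea of a bridging subunit can be made concrete with a small network sketch. The Python example below is illustrative only: it assumes the interactions are held as a simple edge list, and only the Rbbp7 to Prpf19 edge is taken from the text above. The remaining edges, the gene symbols chosen for each group and the `bridging_edges` helper are hypothetical placeholders, not output from STRING.

```python
# Group membership (gene symbols chosen for illustration).
prc2 = {"Ezh2", "Suz12", "Eed", "Rbbp4", "Rbbp7"}
spliceosome = {"Prpf19", "Cdc5l", "Prpf8", "Sf3a1"}

# Illustrative edge list. Only ("Rbbp7", "Prpf19") comes from the text;
# the other edges are placeholders standing in for within-complex links.
edges = [
    ("Ezh2", "Suz12"), ("Ezh2", "Eed"), ("Ezh2", "Rbbp7"), ("Ezh2", "Rbbp4"),
    ("Prpf19", "Cdc5l"), ("Prpf19", "Prpf8"), ("Prpf8", "Sf3a1"),
    ("Rbbp7", "Prpf19"),
]


def bridging_edges(edges, group_a, group_b):
    """Return the edges that have one endpoint in each of the two groups."""
    return [
        (u, v) for u, v in edges
        if (u in group_a and v in group_b) or (u in group_b and v in group_a)
    ]


print(bridging_edges(edges, prc2, spliceosome))
```

In a network exported from STRING the same check would show whether any subunit other than Rbbp7 links the two complexes, which matters because Rbbp7 is shared with other complexes.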

Gene  Ezh2 Spectral  Control Spectral  P-value  Description

AQR_MOUSE 36 0 0.00017612 Splicing factor

CDC5L_MOUSE 56 0 1.06E-06 Component of the PRP19-CDC5L

complex, integral part of the

spliceosome.

DDX46_MOUSE 32 0 0.000449671 Splicing factor

DHX15_MOUSE 78 0 4.10E-09 Putative pre-mRNA-splicing factor

ATP-dependent RNA helicase

HNRPM_MOUSE 38 24 0.003373485 Splicing factor

PRP19_MOUSE 38 2 0.006524444 DNA repair. Pre mRNA splicing –

associated with CDC5L

PRP6_MOUSE 38 0 0.000110695 Splicing factor

PRP8_MOUSE 183 0 2.18E-20 Central component of spliceosome

RBM25_MOUSE 34 0 0.000281027 RNA binding protein – splicing

factor






SFRS1_MOUSE 32 0 0.000449671 Splicing factor

SFRS7_MOUSE 60 48 3.40E-07 Splicing factor

SRRM2_MOUSE 41 0 7.72E-05 Involved in pre-mRNA splicing

U5S1_MOUSE 98 0 4.20E-11 Component of the U5 snRNP

complex required for pre-mRNA

splicing


RPB2_MOUSE 39 0 0.000120205 RNA helicase

DDX5_MOUSE 278 29 5.76E-09 Probable RNA helicase

DDX3X_MOUSE 52 0 4.68E-06 RNA helicase

DDX3Y_MOUSE 42 0 4.41E-05 Probable RNA helicase

DDX21_MOUSE 127 0 2.39E-14 Probable RNA helicase

DDX17_MOUSE 101 11 0.000858146 Probable RNA helicase

Table 3.3: Splicing associated proteins isolated from the total data set.
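To compare hits that also appear in the control sample, the two spectral counts can be expressed as a ratio. The Python sketch below uses counts taken from Table 3.3 and computes a log2 enrichment of the Ezh2 IP over the control. The pseudocount of 1 and the `log2_enrichment` helper are assumptions introduced for illustration; this is not the statistic used to produce the p-values in the table, and the counts are not normalised for the total number of spectra in each sample.

```python
import math

# (Ezh2 IP count, control count) taken from Table 3.3.
counts = {
    "DDX5_MOUSE": (278, 29),
    "DDX17_MOUSE": (101, 11),
    "HNRPM_MOUSE": (38, 24),
    "SFRS7_MOUSE": (60, 48),
    "PRP8_MOUSE": (183, 0),
}


def log2_enrichment(ip, control, pseudocount=1.0):
    """log2 ratio of IP to control counts.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a protein is absent from the control sample.
    """
    return math.log2((ip + pseudocount) / (control + pseudocount))


# Rank proteins from most to least enriched in the Ezh2 IP.
ranked = sorted(counts, key=lambda p: log2_enrichment(*counts[p]), reverse=True)
for protein in ranked:
    print(protein, round(log2_enrichment(*counts[protein]), 2))
```

A ranking of this kind separates proteins seen only in the Ezh2 IP from those, such as HNRPM and SFRS7, whose counts are similar in both samples.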

Figure 3.7: Functional Annotation analysis of splicing proteins

[Figure 3.7 bar chart: x-axis scale 0 to 100; categories: mRNA splicing, Spliceosome, RNA processing, RNA binding, Helicase]

[Figure 3.6 bar chart: y-axis, protein count (0 to 600); series: Ezh2 IP, Control; proteins shown: AQR, CDC5L, DDX46, DHX15, HNRPM, PRP19, PRP6, PRP8, RBM25, SF3A1, SF3A2, SF3A3, SF3B1, SF3B3, SFRS1, SFRS7, SRRM2, U5S1, DDX42, DDX5, DDX21, DDX3X, DDX3Y, DDX17]


Figure 3.6: Protein count levels for the isolated splicing associated proteins.

Figure 3.8: STRING network diagram of the detected splicing proteins and the PRC2 complex (circled).

Figure 3.9: Expression levels of transcription associated proteins in the data set.

[Figure 3.9 bar chart: y-axis, protein count (0 to 140); series: Ezh2 IP, Control]

Gene  Ezh2 Spectral  Control Spectral  P-value  Description

RPB1_MOUSE 53 0 2.70E-06 DNA-directed RNA polymerase

II subunit RPB1

SPT5H_MOUSE 126 0 4.16E-14 Transcription factor

CHD7_MOUSE 53 0 2.70E-06 Chromodomain helicase DNA

binding protein 7

MBB1A_MOUSE 45 0 1.72E-05 May activate or repress

transcription (HDAC activity)

TCRG1_MOUSE 42 0 4.41E-05 Possible Transcription factor

MATR3_MOUSE 41 0 7.72E-05 Matrin 3

PSIP1_MOUSE 39 0 0.000120205 Transcriptional Co-activator

RPB2_MOUSE 39 0 0.000120205 DNA driven RNA pol II

TIF1B_MOUSE 35 0 0.000294268 Transcription repression

ERC6L_MOUSE 30 0 0.000721432 DNA helicase

Table 3.4: Transcription associated proteins isolated from the data set.

Figure 3.10: Functional Annotation Analysis of transcription associated proteins

Figure 3.11: STRING network diagram of the transcription associated proteins and the PRC2 complex.

[Figure 3.10 bar chart: x-axis scale 0 to 100; categories: Acetylation, DNA binding, Transcription, Phosphoprotein, RNA pol II core complex, Regulation of Transcription]

Table 3.4 lists the proteins directly associated with the transcription machinery (as opposed to epigenetic transcriptional regulators, which can be found in the chromatin remodelling sub-group) and, as with previous sub-groups, figure 3.9 indicates the protein count of each hit. One protein in particular, Spt5h, a regulator of transcriptional elongation by RNA pol II, has a high protein count and the lowest p-value of the listed proteins (4.16E-14), strongly suggesting that it is genuinely enriched in the experimental sample.

Further analysis of the data set showed that two components of DNA-directed RNA polymerase II were detected in the sample, RPB1 (Polr2a) and RPB2 (Polr2b), as well as Spt5h (Supt5h) and the transcription elongation factor TCRG1 (Tcerg1), all of which are directly involved in RNA elongation during transcription. According to the STRING network (figure 3.11) these proteins may be interacting with the PRC2 complex through the matrin 3 protein, whose function has yet to be fully determined but is thought to play a role in transcription. These interactions may provide an insight into alternative methods of transcriptional silencing exerted by the PRC2 complex.

Figure 3.12: Expression levels of chromatin remodelling associated proteins in the data set.

[Figure 3.12 bar chart: y-axis, protein count (0 to 140); series: Ezh2 IP, Control]

Gene  Ezh2 Spectral  Control Spectral  P-value  Description

SMCA4_MOUSE 84 0 1.01E-09 Transcriptional activator BRG-1

SMCA5_MOUSE 73 0 1.62E-08 Helicase with ATP-dependent

nucleosome-remodeling activity

ARI1A_MOUSE 72 0 2.72E-08 Component of the nBAF remodelling

complex

PB1_MOUSE 42 0 4.41E-05 Transcriptional activator/repressor.

Negative regulator of cell proliferation

BAZ1B_MOUSE 38 0 0.000110695 Tyrosine-protein kinase

SMRC1_MOUSE 57 0 1.09E-06 SWI/SNF component

SMCA2_MOUSE 44 0 2.80E-05 Component of the SWI/SNF, WINAC

and nBAF complexes

SP16H_MOUSE 135 11 3.03E-06 FACT complex component

NUCL_MOUSE 104 13 0.002483551 FACT complex component, histone chaperone

SSRP1_MOUSE 101 0 1.56E-11 FACT complex component

Table 3.5: Chromatin remodelling proteins isolated from the total data set.

Figure 3.13: Functional Annotation analysis of the isolated chromatin remodelling proteins.

Figure 3.14: STRING network diagram of chromatin remodelling proteins and the PRC2 complex.

[Figure 3.13 bar chart: x-axis scale 0 to 100; categories: Chromatin Remodeling Complex, SWI/SNF complex, Transcription Regulation, Chromatin Assembly/Disassembly]

As with previous sub-groups, the initial table and figure (table 3.5, figure 3.12) refer to the isolated proteins and show the relative abundance of each protein. Figure 3.13 shows the functional annotation of the sub-group proteins, indicating enrichment in transcriptional regulation activity and chromatin remodelling proteins.

The chromatin remodelling components detected are primarily associated with the FACT (facilitates chromatin transcription) and SWI/SNF complexes, with SUPT16H and SSRP1 being the two main protein subunits of the FACT complex [26] and many of the other proteins being SWI/SNF related. The FACT complex is associated with active transcription, and therefore interactions with Ezh2 and the PRC2 complex may be inhibitory, furthering the transcriptional silencing exerted by the PRC2 complex.

Once again the proteins seem to be interacting with PRC2 through the Rbbp7 subunit (figure 3.14) which, as previously mentioned, is a transient interactor of PRC2 and several other protein complexes.

Figure 3.15 shows a STRING network combining the three sub-groups (splicing, transcription, chromatin remodelling) and PRC2 to give a visual overview of the known and potential interactions between all four groups. The resulting map shows many interactions between components of all four groups, which is consistent with functional associations between Ezh2 (PRC2) and each of these groups. However, further experiments would be required to confirm and characterise these interactions and their downstream effects.

Figure 3.15: STRING network diagram of all three subgroups (splicing, chromatin remodelling, and transcription)

and the PRC2 complex. Thicker lines indicate a stronger confidence in the interaction.

4. Discussion

4.1 The current Ezh2 interactome

Several key interactions involving Ezh2 have already been characterised, including those with PHF1 [15], cell cycle regulators [28], and DNA methyltransferases (DNMTs) [29]. It is clear from these studies that Ezh2 and the PRC2 complex have an intricate web of interactors involved in the maintenance of pluripotency and cell differentiation. Due to the importance of Ezh2 and PRC2 for regulation of these processes, any dysfunction in their activity can have severe downstream consequences, primarily tumourigenesis [4][5][6], but mutations in Ezh2 are also known to be the cause of Weaver Syndrome, a congenital disease associated with general overgrowth starting in the pre-natal stage [31]. The association of mutated Ezh2 with cancer has been extensively studied, including therapeutic strategies involving inhibition of Ezh2 in cases of over-expression [32][33]. Ezh2 has also been suggested as a potential biomarker for certain cancers [34][35]; however, high levels of Ezh2 are associated with poor prognosis and late stage cancers, which may undermine the effectiveness of the protein as a biomarker. The importance of Ezh2 for stem cell health makes characterisation of its interactions and downstream effects highly important in understanding the mechanisms by which stem cells are regulated.

4.2 Large Scale Proteomic Analysis of Protein-Protein Interactions

Large scale proteomic approaches have proved to be extremely useful in the past; however, it is important to understand the limitations of such approaches in order to critically assess the results obtained from them. One of the prime concerns in any large scale proteomics project is the complexity of the proteome itself: post-translational modifications and alternative splicing can cause major differences in protein content and activity between cells. What may be true for one cell type under certain conditions may not be true for most other cell types. It is for this reason that large scale analysis, like that carried out in this experiment, can be used to detect global motifs and functional enrichments in a protein population but must be followed up by smaller, more specific experimental analysis of select subsets of proteins. Western blot analysis, yeast two-hybrid methods and fluorescence resonance energy transfer (FRET) experiments are three potential follow-up experiments which can be used to confirm protein-protein interactions. Once an interaction has been confirmed it is then possible to proceed with experiments to determine the type of interaction involved and its effects. RNAi technology, conditional knock-outs, and generation of mutant animal models are all useful in determining the effect of a protein-protein interaction: by interfering with or removing a protein it is possible to observe its effect on other proteins through alterations in phenotype, proteomic composition, gene expression, etc.

4.3 Global analysis of the Ezh2 interactome data

Global analysis of the data set shows a vast interactome, with several interactions of interest. A complex network of interactions seems to be occurring between transcription factors, splicing proteins and chromatin remodelling proteins (figure 3.15), which may point towards mechanisms of genomic regulation not yet characterised with regard to the PRC2 complex.

The PRC2 complex already induces downstream chromatin remodelling through its interaction with PRC1; it is therefore possible that alternate interactions with other chromatin remodelling complexes may also be used to induce chromatin condensation and prevent transcription. The chromatin remodelling complex FACT consists of two subunits, SPT16 and SSRP1 [26], both of which were detected in the data set with high confidence and in high abundance (table 3.5). The FACT complex is known to affect RNA polymerase II, the key protein involved in transcription, by removing an H2A-H2B histone dimer to allow RNA pol II to access the desired regions of DNA contained within the nucleosome [36]. It has been shown that the H2AK119ub1 mark induced by the PRC1 complex prevents the binding of the FACT complex to target genes and inhibits transcription [30]; therefore the PRC2 complex has downstream effects on the FACT complex. However, this does not explain why both FACT subunits, SPT16 and SSRP1, are present in the data set.

Two subunits of the RNA pol II complex (RPB1 (Polr2a) and RPB2 (Polr2b)), as well as two transcription elongation factors (TCRG1 (Tcerg1) and Spt5h (Supt5h)), were found in the data set, providing evidence for interplay between PRC2, FACT, and RNA pol II. However, the low number of transcription factors in the data set would not, on its own, provide enough evidence to support further research into the area. In light of the known interactions between FACT and RNA pol II it is possible that PRC2 has an indirect effect on RNA pol II through FACT inhibition. It is also possible that PRC2 interacts with RNA pol II to receive a strand of ncRNA which it uses to target specific genes [17], although a definite function for the interaction between PRC2 and ncRNA is still uncharacterised.

4.3 Splicing and the PRC2 complex

It is clear from the results that a large number of splicing factors were contained in the sample, primarily belonging to the spliceosome or the DEAD-box containing RNA helicase family (DDX proteins). This would suggest a functional relationship between the spliceosome and the PRC2 complex; however, there are no characterised interactions between the two found in the literature. The spliceosome is essential for modification of transcribed RNA into mRNA through the excision of introns (non-coding regions) and ligation of the remaining exons (coding regions) into a translatable mRNA construct. Without the removal of these introns the transcribed RNA cannot be translated into a protein, so the gene will have no protein product expressed and therefore no cellular effect. If a relationship between the spliceosome and the PRC2 complex does exist it would likely be quite complex, and it is currently unclear whether the two share an antagonistic relationship. It is possible, however, that the PRC2 complex inhibits the activity of the spliceosome, preventing processing of pre-mRNA into translatable mRNA and thus ultimately suppressing gene expression.

In order for this to be a viable avenue for further research it would first be necessary to

provide additional evidence to support the idea that PRC2 and the spliceosome are indeed

interacting. Subsequent pull-downs paired with Western blot analysis would be the simplest

way to achieve this. By probing for both Ezh2 and a core component of the spliceosome, such

as PRP9, PRP18, or CDC5L, it would be possible to show if the two are pulled down together

in subsequent co-immunoprecipitation experiments and therefore warrant further

investigation.

If the Western analysis was successful and indeed showed that PRC2 and the spliceosome are pulled down together, it would then be necessary to establish how the two interact and to what effect. Knock-down of Ezh2 using RNAi technology would prevent the formation of the PRC2 complex, as Ezh2 is essential for PRC2 stability; if the two do interact, this should have an effect on splicing within the cell. The splicing activity in a cell can be measured via mRNA levels using next generation sequencing methods such as Illumina sequencing; however, it would be difficult to determine whether the change in mRNA levels is due to increased spliceosome activity in the absence of PRC2 or purely because transcription of certain genes is no longer repressed by PRC2.

A potential solution to this problem would be the use of Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR) methods, which would amplify specific pre-mRNA and mRNA species in a sample. If the PRC2 complex was inhibiting spliceosome activity on certain genes we would see more of the pre-mRNA from those genes present, as opposed to the processed mRNA product. Upon knock-down of Ezh2 the relative levels of pre-mRNA and mRNA should shift towards the processed mRNA if the spliceosome in that region is no longer being inhibited. A schematic overview of the proposed experiments can be seen in figure 4.1 and figure 4.2.

Figure 4.1: Active inhibition of the spliceosome by PRC2 would result in higher levels of pre-mRNA compared to

processed mRNA, which can be detected through a combination of RT-PCR and agarose gel electrophoresis.

Figure 4.2: No inhibition of the spliceosome by PRC2 would result in an increase in processed mRNA levels compared

to pre-mRNA

Due to the size difference between pre-mRNA and processed mRNA it is possible to detect alterations in pre-mRNA processing using agarose gel electrophoresis to separate the RT-PCR products based on their size. Processed mRNA is generally shorter than its pre-mRNA precursor due to the excision of introns by the spliceosome, and its product therefore travels further in an electrophoresis gel. These properties can be exploited in the proposed experiments in order to detect changes in spliceosomal activity in PRC2+ and PRC2- environments.
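The expected read-out of the proposed RT-PCR experiment can be sketched numerically. The Python example below assumes primers placed at the outer ends of two exons flanking a single intron, so that the unspliced product is longer than the spliced product by the length of the intron. All exon and intron lengths, the band intensities and both helper functions are hypothetical values chosen for illustration, not experimental data.

```python
def amplicon_lengths(exon1_bp, intron_bp, exon2_bp):
    """Expected RT-PCR product sizes for primers at the outer ends of two
    exons flanking a single intron (an assumed primer design)."""
    spliced = exon1_bp + exon2_bp
    unspliced = exon1_bp + intron_bp + exon2_bp
    return spliced, unspliced


def unspliced_fraction(unspliced_signal, spliced_signal):
    """Share of the total band intensity found in the unspliced (pre-mRNA) band."""
    total = unspliced_signal + spliced_signal
    if total == 0:
        raise ValueError("no signal in either band")
    return unspliced_signal / total


# Hypothetical amplicon design (base pairs) and band intensities (arbitrary units).
spliced_bp, unspliced_bp = amplicon_lengths(exon1_bp=120, intron_bp=300, exon2_bp=100)
control = unspliced_fraction(unspliced_signal=60, spliced_signal=40)
knockdown = unspliced_fraction(unspliced_signal=20, spliced_signal=80)

print(f"spliced product: {spliced_bp} bp, unspliced product: {unspliced_bp} bp")
print(f"unspliced fraction, control: {control:.2f}, Ezh2 knock-down: {knockdown:.2f}")
```

Under the hypothesis in figures 4.1 and 4.2, a lower unspliced fraction after Ezh2 knock-down than in the control would be the pattern expected if PRC2 were inhibiting the spliceosome at that gene.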

4.4 Conclusion

Overall it is clear from this project, and from the literature, that the interactome of Ezh2 and the Polycomb proteins is diverse and complicated. The effects of Ezh2 and PRC2 on gene repression have been well characterised, as have some of the molecular mechanisms involved. However, further research is required to understand the full extent of PRC2's effect on transcription and gene expression.

Further research into this field could provide new and alternative therapeutic targets for

cancer treatment, and potentially refine our understanding of stem cell mechanics with

downstream effects on the use of stem cells in regenerative medicine.

5. Acknowledgements

I’d first like to thank Gerard Cagney for the opportunity to work with him and his team on

this project, and for all his help and advice along the way. I’d also like to convey my

appreciation to the members of Dr. Cagney’s group for their support and helpful advice

throughout this project; a big thank you to Dr. Aisling Robinson for teaching me the protocols

and answering all my questions, and to Giorgio Oliviero, Ariane Watson, Guillermo

Gambero, and Nayla Munawar for all their guidance and advice. Lastly I’d like to thank Dr.

Adrian Bracken (TCD) and his group for providing me with Ezh2 immunoprecipitation

samples, without which this project would not have been possible.

Word Count: 6892

6. Bibliography

1. Takahashi, K., Yamanaka, S. (2006) ‘Induction of Pluripotent Stem Cells from Mouse

Embryonic and Adult Fibroblast Cultures by Defined Factors’. Cell 126: 663-676

2. Boyer, L., Plath, K. et al. (2006) ‘Polycomb Complexes Repress Developmental

Regulators in Murine Embryonic Stem Cells’. Nature 441: 349-353

3. Sauvageau, M., Sauvageau, G. (2010). ‘Polycomb Group Proteins: Multi-Faceted

Regulators of Somatic Stem Cells and Cancer’. Cell Stem Cell. 7(3):299-313.

4. Simon, J.A, Lange, C.A. (2008) ‘Roles of the EZH2 histone methyltransferase in cancer

epigenetics’. Mutation Research/Fundamental and Molecular Mechanisms of

Mutagenesis 647: 21-29.

5. Chang, C-J., Hung, M-C. (2011) ‘The Role of EZH2 in Tumour Progression’. British

Journal of Cancer. 106:243-247.

6. Chase, A., Cross, N.C.P. (2011) ‘Aberrations of EZH2 in Cancer’. Clinical Cancer

Research. 17:2613-2618.

7. Luger, K., Richmond, T.J. (1998) ‘The Histone Tails of the Nucleosome’. Current

Opinion in Genetics and Development. 8:140-146.

8. Strahl, B.D., Allis, C.D. (2000) ‘The Language of Covalent Histone Modifications’.

Nature. 403:41-45.

9. Vastenhouw, N., Schier, A. (2012) ‘Bivalent Histone Modifications in Early

Embryogenesis’. Current Opinion in Cell Biology. 24:374-386.

10. Ekwall K., Olsson T., Turner BM., Cranston G., Allshire RC. (1997). ‘Transient

inhibition of histone deacetylation alters the structural and functional imprint at fission

yeast centromeres’. Cell. 91:1021–1032.

11. VerMilyea, M., O’Neill, L., Turner, B. (2009). ‘Transcription-Independent Heritability of

Induced Histone Modifications in the Mouse Preimplantation Embryo’. PLoS One. 6(4)

12. Faust, C., Schumacher, A., Holdener, B., Magnuson, T. (1995) ‘The EED mutation disrupts

anterior mesoderm production in mice’. Development. 121: 273-285.

13. O’Carroll, D., Erhardt, S., Pagani, M., Barton, S.C., Surani, M.A, Jenuwein, T. (2001).

‘The polycomb-group gene EZH2 is required for early mouse development’. Mol. Cell

Biol. 21: 4330-4336.

14. Pasini, D., Bracken, A.P., Jensen, M.R., Lazzerini Denchi, E., Helin, K. (2004). ‘Suz12 is

essential for mouse development and Ezh2 methyltransferase activity’. EMBO J. 23:

4061-4071.

15. Sarma, K., Margueron, R., Ivanov, A., Pirrotta, V., Reinberg, D. (2008). ‘Ezh2 requires

PHF1 to efficiently catalyse H3 lysine 27 trimethylation in vivo’. Mol. Cell Biol. 28(8):

2718-2731

16. Pasini, D., Cloos, P.A., Walfridsson, J. et al. (2010) ‘JARID2 Regulates Binding of the

Polycomb Repressive Complex 2 to Target Genes in ES Cells’. Nature 464: 306-310

17. Kanhere, A., Viiri, K., Araújo, C.C. et al. (2010) ‘Short RNAs are Transcribed from

Repressed Polycomb Target Genes and Interact with Polycomb Repressive Complex-2’.

Molecular Cell 38(5): 675-688

18. Margueron, R., Reinberg, D. (2011) ‘The Polycomb complex PRC2 and its mark in life’.

Nature 469: 343-349

19. Cox, J. & Mann, M. (2008) ‘MaxQuant enables high peptide identification rates,

individualized p.p.b.-range mass accuracies and proteome-wide protein quantification’.

Nature Biotechnol. 26: 1367–1372.

20. MaxQuant. Available at: http://www.Maxquant.org/index.htm [accessed 08/02/13]

21. UniProt (Universal Protein Resource). Available at: http://www.uniprot.org/help/about

[accessed 10/02/13].

22. Huang DW, Sherman BT, Lempicki RA. (2009) ‘Systematic and integrative analysis of

large gene lists using DAVID Bioinformatics Resources’. Nature Protoc. 4(1):44-57.

23. Huang DW, Sherman BT, Lempicki RA. (2009) ‘Bioinformatics enrichment tools: paths

toward the comprehensive functional analysis of large gene lists’. Nucleic Acids Res.

37(1):1-13.

24. STRING. Available at: http://www.string-db.org [accessed: 18/02/13]

25. Zhang, Y., Ng, H-H, et al. (1999) ‘Analysis of the NuRD subunits reveals a histone

deacetylase core complex and a connection with DNA methylation’. Genes and

Development 13: 1924-1935.

26. Orphanides, G., Wu, W.H., Lane, W.S., Hampsey, M., Reinberg, D. (1999) ‘The

chromatin-specific transcription elongation factor FACT comprises human SPT16 and

SSRP1 proteins’. Nature 400(6741): 284-288.

27. Allis, C.D. (Ed.), Jenuwein, T. (Ed.), Reinberg, D. (Ed.), Caparros, M.L. (Ed.) (2007)

‘Epigenetics’. New York: Cold Spring Harbor Laboratory Press

28. Kaneko, S., Li, G., Son, J. et al (2010) ‘Phosphorylation of the PRC2 component Ezh2 is

cell cycle-regulated and up-regulates its binding to ncRNA’. Genes & Development 24:

2615-2620.

29. Viré, E., Brenner, C., Deplus, R. et al. (2006) ‘The Polycomb Group Protein EZH2

Directly Controls DNA Methylation’. Nature 439(7078): 871-874.

30. Zhou, W., Zhu, P., Wang, J., Pascual, G. et al. (2008) ‘Histone H2A Monoubiquitination

Represses Transcription by Inhibiting RNA Polymerase II Transcriptional Elongation’.

Mol. Cell 29(1): 69-80.

31. Gibson, W.T., Hood, R., Zhan, S.H. et al. (2012) ‘Mutations in Ezh2 Cause Weaver

Syndrome’. American Journal of Human Genetics 90(1): 110-118.

32. Kikuchi J. et al. (2012) ‘Epigenetic Therapy with 3-deazaneplanocin A, an Inhibitor of the

Histone Methyltransferase EZH2, Inhibits Growth of Non-Small Cell Lung Cancer Cells’.

Lung Cancer. 78(2):138-143.

33. Crea, F., Fornaro, L., Bocci, G. et al. (2012) ‘EZH2 inhibition: targeting the crossroad of

tumor invasion and angiogenesis’. Cancer Metastasis Review 31(3-4): 753-761.

34. Hajósi-Kalcakosz, S., Dezsó, K., Bodor, S. et al (2012) ‘Enhancer of zeste homologue 2

(Ezh2) is a reliable immunohistochemical marker to differentiate malignant and benign

hepatic tumors’. Diagnostic Pathology 7: 86.

35. Cai, M., Tong, Z., Zheng, F. et al. (2011) ‘EZH2 protein: a promising immunomarker for

the detection of hepatocellular carcinomas in liver needle biopsies’ Gut 60: 967-976.

36. Reinberg, D., Sims, R.J. (2006) ‘deFACTo Nucleosome Dynamics’. The Journal of

Biological Chemistry 281: 23297-23301.

7. Appendices

Appendix 1

Gene | Ezh2 Spectral | Control Spectral | P-value | Odds Ratio

TITIN_MOUSE 382 0 1.22E-41 NA

GUF1_MOUSE 231 0 1.20E-25 NA

PRP8_MOUSE 183 0 2.18E-20 NA

SF3B1_MOUSE 150 0 9.82E-17 NA

SF3B3_MOUSE 142 0 5.90E-16 NA

DDX21_MOUSE 127 0 2.39E-14 NA

TOP2A_MOUSE 127 0 2.39E-14 NA

SPT5H_MOUSE 126 0 4.16E-14 NA

SF3A2_MOUSE 106 0 3.88E-12 NA

SSRP1_MOUSE 101 0 1.56E-11 NA

U5S1_MOUSE 98 0 4.20E-11 NA

SMCA4_MOUSE 84 0 1.01E-09 NA

H2B1B_MOUSE 81 0 2.62E-09 NA

H2B1C_MOUSE 81 0 2.62E-09 NA

H2B1K_MOUSE 81 0 2.62E-09 NA

DHX15_MOUSE 78 0 4.10E-09 NA

K2C6B_MOUSE 78 0 4.10E-09 NA

H2A1K_MOUSE 75 0 1.02E-08 NA

SUZ12_MOUSE 76 0 1.12E-08 NA


SPT6H_MOUSE 73 0 1.62E-08 NA

EZH2_MOUSE 74 0 1.75E-08 NA

ARI1A_MOUSE 72 0 2.72E-08 NA

SMG1_MOUSE 67 0 6.59E-08 NA

H2AV_MOUSE 60 0 4.17E-07 NA

LRP2_MOUSE 60 0 4.17E-07 NA

ASH1L_MOUSE 61 0 4.44E-07 NA

ASPM_MOUSE 61 0 4.44E-07 NA

VP13C_MOUSE 61 0 4.44E-07 NA

EED_MOUSE 59 0 6.94E-07 NA

CDC5L_MOUSE 56 0 1.06E-06 NA

DDX42_MOUSE 56 0 1.06E-06 NA

H2B3B_MOUSE 56 0 1.06E-06 NA

EIF3A_MOUSE 57 0 1.09E-06 NA

H2AY_MOUSE 57 0 1.09E-06 NA

SMRC1_MOUSE 57 0 1.09E-06 NA

H2B1A_MOUSE 55 0 1.71E-06 NA

SACS_MOUSE 55 0 1.71E-06 NA

CHD7_MOUSE 53 0 2.70E-06 NA

RPB1_MOUSE 53 0 2.70E-06 NA

CD158_MOUSE 52 0 4.68E-06 NA

DDX3X_MOUSE 52 0 4.68E-06 NA

ACINU_MOUSE 49 0 6.77E-06 NA

EH1L1_MOUSE 50 0 7.28E-06 NA

RB6I2_MOUSE 50 0 7.28E-06 NA

SPTA2_MOUSE 50 0 7.28E-06 NA

CROCC_MOUSE 47 0 1.08E-05 NA

DYHC1_MOUSE 47 0 1.08E-05 NA

RS9_MOUSE 47 0 1.08E-05 NA

ACTA_MOUSE 48 0 1.14E-05 NA

FXR1_MOUSE 45 0 1.72E-05 NA

MBB1A_MOUSE 45 0 1.72E-05 NA

TBA1A_MOUSE 45 0 1.72E-05 NA

RIF1_MOUSE 43 0 2.75E-05 NA

SMC1A_MOUSE 43 0 2.75E-05 NA

DYHC2_MOUSE 44 0 2.80E-05 NA

EWS_MOUSE 44 0 2.80E-05 NA

RS2_MOUSE 44 0 2.80E-05 NA


DDX3Y_MOUSE 42 0 4.41E-05 NA

MYH10_MOUSE 42 0 4.41E-05 NA

PB1_MOUSE 42 0 4.41E-05 NA

TCRG1_MOUSE 42 0 4.41E-05 NA

CHD4_MOUSE 40 0 6.98E-05 NA

MATR3_MOUSE 41 0 7.72E-05 NA

SRRM2_MOUSE 41 0 7.72E-05 NA

BAZ1B_MOUSE 38 0 0.000110695 NA

PRP6_MOUSE 38 0 0.000110695 NA

PSIP1_MOUSE 39 0 0.000120205 NA

RPB2_MOUSE 39 0 0.000120205 NA

AQR_MOUSE 36 0 0.00017612 NA

EGF_MOUSE 36 0 0.00017612 NA

IF4A3_MOUSE 36 0 0.00017612 NA

RIMS2_MOUSE 36 0 0.00017612 NA

SMCA1_MOUSE 37 0 0.000187766 NA

FMN2_MOUSE 34 0 0.000281027 NA

RBM25_MOUSE 34 0 0.000281027 NA

BAHC1_MOUSE 35 0 0.000294268 NA

CJ018_MOUSE 35 0 0.000294268 NA

H2A2B_MOUSE 35 0 0.000294268 NA

TIF1B_MOUSE 35 0 0.000294268 NA

DDX46_MOUSE 32 0 0.000449671 NA

ERC2_MOUSE 32 0 0.000449671 NA

K1C18_MOUSE 32 0 0.000449671 NA

SFRS1_MOUSE 32 0 0.000449671 NA

SMU1_MOUSE 32 0 0.000449671 NA

RBBP7_MOUSE 33 0 0.000462695 NA

TCOF_MOUSE 33 0 0.000462695 NA

ERC6L_MOUSE 30 0 0.000721432 NA

ILF3_MOUSE 30 0 0.000721432 NA

MGAP_MOUSE 30 0 0.000721432 NA

VWF_MOUSE 30 0 0.000721432 NA

CP250_MOUSE 31 0 0.000729902 NA

DKC1_MOUSE 31 0 0.000729902 NA

DNMT1_MOUSE 31 0 0.000729902 NA

KDM5B_MOUSE 31 0 0.000729902 NA

KTN1_MOUSE 31 0 0.000729902 NA

MAN1_MOUSE 31 0 0.000729902 NA

MDC1_MOUSE 31 0 0.000729902 NA

PGBM_MOUSE 31 0 0.000729902 NA

SF3A1_MOUSE 554 12 1.26E-43 13.22822078

SF3A3_MOUSE 195 10 1.18E-11 5.587371232

PRP19_MOUSE 38 2 0.006524444 5.444105303

NPM_MOUSE 65 4 0.000386053 4.656142694

GPC4_MOUSE 30 2 0.031020003 4.297977871

RBBP4_MOUSE 43 3 0.0071263 4.106956632

H31_MOUSE 69 5 0.000627128 3.954139641

SP16H_MOUSE 135 11 3.03E-06 3.516527349

RS14_MOUSE 54 5 0.00752452 3.094544067

RS11_MOUSE 49 5 0.020706808 2.808012209

DDX5_MOUSE 278 29 5.76E-09 2.746753674

IMB1_MOUSE 37 4 0.059257196 2.650419687

DDX17_MOUSE 101 11 0.00085814594989 2.630883424

FBRL_MOUSE 68 8 0.012257672 2.435520794

FLNB_MOUSE 73 9 0.011268778 2.324091738

NUCL_MOUSE 104 13 0.002483551 2.292254865

RS3A_MOUSE 72 9 0.015189771 2.292254865

RS4X_MOUSE 70 9 0.020304002 2.228581118

RS8_MOUSE 44 6 0.089011007 2.101233626

GBLP_MOUSE 65 10 0.069847922 1.862457077

RS18_MOUSE 57 9 0.102930169 1.814701768

HNRPF_MOUSE 35 6 0.345661769 1.671435839

VIGLN_MOUSE 81 14 0.083375425 1.657791465

RS15A_MOUSE 46 8 0.250104603 1.647558184

DHX9_MOUSE 82 15 0.112667001 1.566374157

ANXA1_MOUSE 38 7 0.36919869 1.555458658

RS3_MOUSE 105 21 0.161310175 1.43265929

RU2A_MOUSE 30 6 0.548116498 1.43265929

HNRH1_MOUSE 34 7 0.572424102 1.391726168

H2AX_MOUSE 58 12 0.387046223 1.384903981

ROA3_MOUSE 42 9 0.502837955 1.337148671

TOP2B_MOUSE 51 11 0.447403643 1.328465887

RSSA_MOUSE 74 16 0.373511099 1.325209844

FLNC_MOUSE 35 8 0.713823077 1.253576879

RSMB_MOUSE 38 9 0.726594357 1.209801179

RSMN_MOUSE 38 9 0.726594357 1.209801179

LAMA5_MOUSE 37 9 0.858980928 1.177964305

H4_MOUSE 102 26 0.670001306 1.12408652

PUF60_MOUSE 35 9 0.857949026 1.114290559

SFPQ_MOUSE 35 9 0.857949026 1.114290559

ANXA6_MOUSE 34 9 1 1.082453686

RS16_MOUSE 33 9 1 1.050616813

FLNA_MOUSE 91 25 0.911237699 1.042975963

SMD2_MOUSE 40 11 1 1.041934029

H2A2A_MOUSE 73 21 1 0.996039316

DDX3L_MOUSE 48 14 1 0.982394942

HNRPU_MOUSE 91 27 0.911565935 0.965718485

LMNB1_MOUSE 76 23 0.808907924 0.946800922

H2B2B_MOUSE 82 25 0.815625424 0.939824494

MYH14_MOUSE 58 18 0.782344563 0.92326932

HNRPC_MOUSE 32 10 0.85256351 0.916901946

HNRPD_MOUSE 30 10 0.703513746 0.859595574

UBIQ_MOUSE 30 10 0.703513746 0.859595574

HSP7C_MOUSE 86 29 0.432880647 0.849715165

CLH_MOUSE 30 11 0.456280468 0.781450522

XPO2_MOUSE 30 11 0.456280468 0.781450522

GCAA_MOUSE 119 46 0.090772873 0.741245459

HNRPK_MOUSE 43 17 0.276253283 0.724757053

H2B2E_MOUSE 58 23 0.182766355 0.722558599

HSP72_MOUSE 40 16 0.261187524 0.716329645

COQ6_MOUSE 30 12 0.352429381 0.716329645

PLEC1_MOUSE 102 41 0.069881298 0.712835354

H12_MOUSE 35 17 0.093012218 0.589918531

H13_MOUSE 35 17 0.093012218 0.589918531

TBA1B_MOUSE 45 23 0.028046307 0.560605809

ARHG1_MOUSE 33 17 0.059577029 0.556208901

DPOLZ_MOUSE 30 17 0.033260467 0.505644455

LMNA_MOUSE 49 28 0.005541794 0.501430752

SPTB2_MOUSE 43 25 0.007837104 0.492834796

MYH11_MOUSE 59 36 0.000721125 0.469593879

ANXA2_MOUSE 39 24 0.005630124 0.465614269

HNRPM_MOUSE 38 24 0.003373485 0.453675442

IGKC_MOUSE 81 53 8.88E-06 0.437907179

TBB5_MOUSE 36 25 0.00103211 0.412605876

SFRS7_MOUSE 60 48 3.40E-07 0.358164823

EPIPL_MOUSE 40 33 1.46E-05 0.347311343

TERA_MOUSE 33 28 4.26E-05 0.337698261

CCD25_MOUSE 228 199 1.12E-27 0.328287757

MYH9_MOUSE 41 36 2.81E-06 0.326327949

K2C8_MOUSE 47 42 2.61E-07 0.320642794

EF1A1_MOUSE 35 33 2.48E-06 0.303897425

ACTG_MOUSE 71 67 1.89E-11 0.303638238

ACTB_MOUSE 70 67 1.57E-11 0.299361643

K1C19_MOUSE 61 64 2.40E-12 0.273100677

K2C72_MOUSE 34 36 1.19E-07 0.270613422

ACTBL_MOUSE 33 36 8.61E-08 0.262654203

K1C42_MOUSE 72 84 1.97E-17 0.245598735

K2C7_MOUSE 36 42 1.47E-09 0.245598735

ACTC_MOUSE 48 57 1.27E-12 0.241289986

K22O_MOUSE 67 81 1.61E-17 0.237007833

K22E_MOUSE 55 67 6.97E-15 0.235212719

K2C6A_MOUSE 82 100 2.26E-21 0.234956124

K1C16_MOUSE 55 68 2.83E-15 0.231753709

K2C74_MOUSE 38 49 7.51E-12 0.22220838

K2C5_MOUSE 95 125 4.75E-28 0.217764212

K1C17_MOUSE 73 98 8.54E-23 0.213436996

K2C4_MOUSE 49 66 5.76E-16 0.212728198

K1C14_MOUSE 85 118 8.29E-28 0.206400067

K1C15_MOUSE 64 93 3.85E-23 0.197183214

K1C10_MOUSE 105 156 4.46E-38 0.192857981

K2C75_MOUSE 74 110 1.69E-27 0.192757795

K2C71_MOUSE 32 48 4.89E-13 0.191021239

K2C79_MOUSE 38 58 1.28E-15 0.187727769

K2C1_MOUSE 53 81 3.62E-21 0.187483808

K1C13_MOUSE 62 98 7.45E-26 0.181275257

K2C73_MOUSE 50 85 7.75E-24 0.168548152

PLAK_MOUSE 31 53 1.51E-15 0.167594106

K2C1B_MOUSE 31 55 1.87E-16 0.161499775

KPYM_MOUSE 82 163 1.59E-48 0.144144861

Appendix 1: Total data set obtained from MaxQuant and Perseus analysis.
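
The rows of Appendix 1 follow a fixed five-field layout (protein entry name, Ezh2 spectral count, control spectral count, P-value, odds ratio), so they can be read programmatically. The following is a minimal Python sketch of one way to do this, not the procedure used in the study. The names `Row`, `parse_row`, `is_enriched` and `implied_scale` are hypothetical, the 0.05 cut-off is an assumed threshold, and treating an odds ratio of NA (no control spectra) as enrichment is likewise an assumption.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    protein: str
    ezh2_count: int
    control_count: int
    p_value: float
    odds_ratio: Optional[float]  # None where the table prints NA


def parse_row(line: str) -> Row:
    """Split one whitespace-separated table row into typed fields."""
    protein, ezh2, control, p_value, odds = line.split()
    return Row(protein, int(ezh2), int(control), float(p_value),
               None if odds == "NA" else float(odds))


def is_enriched(row: Row, alpha: float = 0.05) -> bool:
    """Assumed filter: P-value below alpha and odds ratio above 1 (or NA)."""
    if row.p_value >= alpha:
        return False
    return row.odds_ratio is None or row.odds_ratio > 1.0


def implied_scale(row: Row) -> Optional[float]:
    """Raw count ratio divided by the tabulated odds ratio, where defined."""
    if row.odds_ratio is None or row.control_count == 0:
        return None
    return (row.ezh2_count / row.control_count) / row.odds_ratio


# Four rows copied from Appendix 1.
SAMPLE = """\
SUZ12_MOUSE 76 0 1.12E-08 NA
SF3A1_MOUSE 554 12 1.26E-43 13.22822078
HNRPF_MOUSE 35 6 0.345661769 1.671435839
KPYM_MOUSE 82 163 1.59E-48 0.144144861
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
enriched = [row.protein for row in rows if is_enriched(row)]
print(enriched)
```

For the sample rows, `implied_scale` returns approximately 3.49 for both SF3A1 and KPYM. This suggests that the tabulated odds ratios are the raw count ratios divided by a constant, possibly a normalisation for total spectra per sample; this is an inference from the numbers and is not stated in the table.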

Appendix 2

Appendix 2: DAVID functional annotation analysis of the abbreviated data set (Table 3.1).

Appendix 3: Global STRING analysis of the abbreviated data set (table 3.1). Nodes with no interactions have been

removed and remaining nodes have been clustered using MCL=4 means. STRING clustering uses a global interaction

score and clusters nodes with high interaction scores together. The PRC2 complex is circled in red.