Genome Structure, Chromatin and The Nucleosome
I. Introduction
A. Introduction: The Human Genome Project
1. Since the first gene was sequenced in 1972, molecular
biologists have wished to take a systematic approach to
understanding the human genome
2. From the idea of the systematic approach, the Human Genome
Project was inaugurated in 1990
a. Billed as Biology's first big project
b. Projected time frame to finish the project was 15 years
3. A molecular biologist defines a genome as the total set of DNA
molecules present in an organism, a cell or an organelle
4. When the human genome and the Human Genome Project are
discussed, the genome in question is the total set of DNA molecules
in the cell
a. Chromosomes (24 types)
b. Mitochondrial DNA
5. The goal of the Human Genome Project was to acquire
fundamental information about our genetic makeup
a. How many genes are present in the human genome
b. How many genes are actually similar to each other in DNA
sequence
c. How many base pairs are present in the Human Genome
d. How many base pairs are present in a chromosome
e. How many base pairs are not part of genes (intervening
sequences)
f. Understand how genes interact with each other
6. Although the basic goals of the project are listed above, the
major justification for a project lasting 13 years and costing $2.7
billion was its possible medical benefits
a. Allow for more accurate diagnosis of inherited disorders
b. Provide a framework for the development of new therapies
c. Personalized medicine: wide-scale application of mutation
screening, which allows a move from treating advanced symptoms of
disease to preventing disease
7. With personalized medicine come significant ethical, legal
and societal implications; roughly 3-5% of the annual project
budget was set aside for studying them
B. Introduction: Human Genome Project Findings
1. The haploid human genome consists of roughly 3 billion base
pairs spread across 23 chromosomes; a diploid cell carries two
copies, for 46 chromosomes (23 pairs)
2. Human chromosomes can range in size from 50 million to 300
million base pairs
a. The smallest chromosome is chromosome 21
b. The largest chromosome is chromosome 1
3. The Human Genome has roughly 20,000 genes (protein coding
only)
a. Chromosome 1 has the most genes at about 3000
b. The Y chromosome has the fewest genes, with only 70-200
c. Chromosome 5 has lowest gene density
d. Chromosome 19 has the highest gene density
4. The figure of 20,000 genes is only a rough estimate and was
determined using computer-based methods that search for the
following
a. Genes that encode proteins
b. Evidence of evolutionarily conserved gene sequences
5. The human genome does not have many more genes than other
less complex organisms
a. C. elegans has 20,000 genes and a genome size of 100 million
base pairs
b. Drosophila has 14,000 genes and a genome size of 165 million
base pairs
6. The reason why humans can get away with so few genes is that
on average one human gene encodes not just one protein but actually
three
7. Introduction: Human Genome Project Findings: Supplemental
Figure
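The gene counts and genome sizes quoted above can be turned into gene densities to make the comparison concrete. The following Python sketch uses only the round figures from this outline, so the results are rough estimates; the dictionary and function names are illustrative and do not come from any library.

```python
# Round figures taken from the outline: (protein-coding genes, genome size in bp).
GENOMES = {
    "Human": (20_000, 3_000_000_000),
    "C. elegans": (20_000, 100_000_000),
    "Drosophila": (14_000, 165_000_000),
}

def genes_per_megabase(gene_count, genome_bp):
    """Return gene density as genes per million base pairs."""
    return gene_count / (genome_bp / 1_000_000)

for name, (genes, size_bp) in GENOMES.items():
    print(f"{name}: {genes_per_megabase(genes, size_bp):.1f} genes per Mb")

# If one human gene encodes about three proteins on average,
# 20,000 genes could give rise to roughly 60,000 proteins.
AVERAGE_PROTEINS_PER_HUMAN_GENE = 3
estimated_human_proteins = GENOMES["Human"][0] * AVERAGE_PROTEINS_PER_HUMAN_GENE
print(f"Estimated human proteins: {estimated_human_proteins:,}")
```

With these numbers the human genome comes out far less gene-dense than either the worm or the fly genome, while the estimated number of proteins is about three times the gene count.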
II. Fitting the Human Genome Into The Cell
A. Fitting the Human Genome Into The Cell: Introduction
1. Fitting the 3.0 billion base pair genome into a human cell
nucleus is a difficult proposition for the cell
2. If you were to stretch out the DNA of a single diploid human
cell (two copies of the 3.0 billion base pair genome) end to end,
the DNA would be about 2 m long
a. A human cell nucleus is approximately 10 µm in diameter
b. This represents a packing ratio of 1,000- to 10,000-fold
3. Another way of looking at this: how easy would it be to take a
string the length of the Willis Tower and fit it into a tennis
ball?
4. Somehow, the cell must have a way to actually package all of
the DNA into the nucleus
5. "The way in which eukaryotic DNA is packaged in the cell
nucleus is one of the wonders of macromolecular structure"
(G. Michael Blackburn, Nucleic Acids in Chemistry and Biology
(1990), p. 65)
6. In order to fit the large human genome into a relatively
small nucleus, human cells have intricate mechanisms for
packaging the DNA
a. Dividing the genome into linear chromosomes
b. Winding the DNA around proteins called histones
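The roughly 2 m length quoted in item 2 above follows from the spacing of base pairs along the double helix. A minimal sketch of the arithmetic, assuming the standard B-form DNA rise of about 0.34 nm per base pair (a value not stated in this outline); the function and variable names are illustrative.

```python
RISE_PER_BP_NM = 0.34  # assumed B-form DNA rise per base pair, in nanometres

def contour_length_m(base_pairs, rise_nm=RISE_PER_BP_NM):
    """Length in metres of a fully stretched DNA molecule of the given size."""
    return base_pairs * rise_nm * 1e-9

haploid_bp = 3.0e9            # one copy of the human genome
diploid_bp = 2 * haploid_bp   # the two copies present in a diploid cell

print(f"Haploid genome: {contour_length_m(haploid_bp):.2f} m")
print(f"Diploid genome: {contour_length_m(diploid_bp):.2f} m")
```

Under this assumption one copy of the genome is about 1 m long, and the two copies in a diploid nucleus together come to about 2 m.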
B. Fitting the Human Genome Into The Cell: Dividing the Genome
Into Chromosomes
1. By definition, a chromosome is a structure of a long DNA
molecule and its associated proteins, which carries all or part of
the hereditary information for an organism
2. All organisms and viruses carry their genomes in the form of
chromosomes
3. The number of chromosomes an organism has varies, and does
not correlate with the complexity of the organism (see table
7-1)
4. In general, most prokaryotes have one circular chromosome
5. Most eukaryotes have multiple linear chromosomes of varying
size and copy number
a. Most eukaryotes have 2 copies of each chromosome
(homologs)
b. Some eukaryotes may have only one copy of each chromosome
c. In rare cases, some eukaryotes may have more than two copies
of each chromosome (Tetrahymena)
C. Fitting the Human Genome Into The Cell: The Purpose of
Dividing The Genome Into Chromosomes
1. Dividing the genome into chromosomes is actually very
important for all organisms
a. Allows for easier compaction to fit the DNA into the nucleus
(or nucleoid)
b. Protects the DNA from damage
c. Provides an easy way to transmit the DNA to daughter cells
when a cell divides
2. For sexually reproducing organisms, having chromosomes allows
for easy transmission of half the genome to each gamete in a
process called meiosis
3. Dividing the genome into chromosomes gives the cell the
ability to control DNA structure, and by proxy to control gene
expression
a. Allows some genes to be expressed
b. Allows some genes to not be expressed
D. Fitting the Human Genome Into The Cell: The Human Chromosomal
Content
1. Humans have a total of 46 chromosomes, with two copies of
each chromosome
2. Humans have 23 pairs of chromosomes
3. Each member of a chromosome pair (homolog) will have the
exact same genes
4. However, each member will not necessarily have the same
sequence (variant or allele) for each gene
a. In cases where both chromosomes have the same sequence of a
specific gene, the individual is a homozygote for that gene
b. In cases where the two chromosomes have different sequences for
a specific gene, the individual is a heterozygote for that gene
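The homozygote and heterozygote definitions above amount to a simple comparison of the two copies of a gene. A minimal Python sketch, in which the function name and the allele labels are hypothetical placeholders rather than real sequences:

```python
def zygosity(allele_one, allele_two):
    """Classify a genotype from the sequences carried by the two homologs."""
    return "homozygote" if allele_one == allele_two else "heterozygote"

# Hypothetical allele labels for a single gene: "A" and "S" are placeholders,
# not real sequences.
print(zygosity("A", "A"))  # homozygote
print(zygosity("A", "S"))  # heterozygote
```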
E. Fitting the Human Genome Into The Cell: The Human Chromosomal
Content and Sickle Cell Anemia (Supplemental Figure)
F. Fitting The Genome Into The Cell: Chromosome Structure
1. Eukaryotic chromosomes have several important features
a. Allow for greater DNA stability
b. Allow for efficient gene expression
c. Allow for control of gene expression
2. Each end of a linear chromosome is capped with a telomere for
maintenance of chromosome stability
3. Throughout each chromosome reside origins of replication,
which direct the start of DNA synthesis
4. Each eukaryotic chromosome will have a more or less centrally
located centromere, which is important for chromosome segregation
in mitosis/meiosis
a. An elaborate protein complex known as the kinetochore binds
the centromere
b. Microtubules will in turn interact with the kinetochore to
pull the chromosomes to opposite poles of the cell during
mitosis
5. The centromere divides the chromosome into arms
a. p-arm is the short arm (petit)
b. q-arm is the long arm
6. Fitting The Genome Into The Cell: Chromosome Structure
(Supplemental Figure)
7. It is not the norm to have chromosomes with multiple
centromeres
8. However, this does occasionally happen in humans
9. Having a chromosome with two centromeres creates a problem
during mitosis as each centromere may be getting pulled in opposite
directions leading to chromosomal breakage
10. Having no centromeres is also not the norm in human
populations, and results in completely random segregation during
mitosis
III. Chromatin Formation
A. Winding The DNA Into Chromatin: Introduction
1. Although dividing the genomic DNA into chromosomes helps with
packaging, it clearly is not enough
2. In addition to having chromosomes, the chromosomal DNA is
wound around proteins to form chromatin
3. Chromatin is the genomic DNA and its associated proteins
4. Winding the chromosomal DNA around proteins allows it to fit
much more easily into the nucleus of the cell
5. To form chromatin, the genomic DNA is wound into repeating
structures called nucleosomes
6. Each chromosome will have many nucleosomes associated with
it
7. Each nucleosome contains the following
a. Chromosomal DNA
b. Histone proteins around which the DNA is wound
8. The nucleosome allows for compaction of the DNA by about
6-fold
9. This is only the first stage in compaction, as the genome
needs to be compacted 1,000- to 10,000-fold to fit into the
nucleus
B. Winding The DNA Into Chromatin: The Histone Proteins
1. Each nucleosome consists of a core of eight histone proteins
and the chromosomal DNA wound around them
2. The histone proteins were discovered in 1884 by Albrecht
Kossel, who isolated them from goose erythrocytes
a. He found that they were basic (have a positive charge)
b. Since then histones have been discovered in many other
organisms
3. Eukaryotic nucleosomes generally have a total of 5 abundant
histones divided into two sub-categories
a. Core histones, which show strong conservation
b. Linker histones
4. The core histones form the core of the nucleosome as an
octamer (a complex of 8 proteins) and are found in the cell in
roughly equal amounts
a. Histone H2A
b. Histone H2B
c. Histone H3
d. Histone H4
5. The core histones tend to be basic
a. Rich in arginine
b. Rich in lysine
6. The protein core of a nucleosome is a disk-shaped structure
that assembles in an ordered fashion only in the presence of DNA
(discovered through in vitro experiments)
7. Without DNA, the core histones will create intermediate
assemblies, which are mediated by the conserved histone fold domain
(discovered through in vitro experiments)
B. Winding The DNA Into Chromatin: The Core Histone Proteins
Come Together To Form an Octamer
1. In order to form the core, the Histone H3 and Histone H4
proteins form a central tetramer
a. H3 and H4 proteins form heterodimers
b. Two H3/H4 heterodimers interact to form the tetramer
2. Once the H3/H4 tetramer forms, it binds to DNA to start the
winding process (~147 bp of DNA)
3. Next, Histones H2A and H2B form heterodimers which then bind
onto each end of the H3/H4 tetramer
4. The DNA that is wound around the core histones is considered
the core DNA and is wound ~1.65 times around the octamer like
thread around a spool
5. The fifth histone present in a nucleosome is the linker
histone
6. There are two possible types of linker histones
a. Histone H1
b. Histone H5 (only found in avian erythrocytes)
7. Compared to the core histones, the linker histones are
larger, with a molecular weight of greater than 20 kDa
8. Histone H1 is half as abundant in the cell as the core
histones
9. Histone H1 is not part of the core nucleosome particle,
instead it binds to the linker DNA
10. If one counts the DNA wound around the core nucleosomes and
the linker DNA which is associated with Histone H1 there are about
180 bp (in humans)
11. The role of the linker Histone H1 is to induce tighter
wrapping of the DNA around the nucleosome
12. Histone H1 binding relative to the nucleosome is different
than the core histones
a. Only one Histone H1 protein binds
b. Histone H1 binding sites are located asymmetrically with
respect to the nucleosome
13. The two Histone H1 binding sites are as follows:
a. The linker DNA at one end of the nucleosome (only linker DNA
on one side of the nucleosome is protected from MNase
digestion)
b. At the mid-point of the 147 bp associated with the core
histone complex
14. In the end, the whole nucleosome includes the following
a. Core histones and the DNA that is wrapped around them
b. Linker histone, and the DNA that is covered by it
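The parts list above can be summarized as a small data structure. This is only a bookkeeping sketch in Python with hypothetical class and field names; the numbers are the ones quoted in this section (147 bp of core DNA and about 180 bp per nucleosome in humans).

```python
from dataclasses import dataclass

# Copies of each core histone in the octamer:
# one H3/H4 tetramer plus two H2A/H2B heterodimers.
CORE_HISTONES = {"H2A": 2, "H2B": 2, "H3": 2, "H4": 2}

@dataclass
class Nucleosome:
    core_dna_bp: int = 147      # DNA wound around the histone octamer
    total_dna_bp: int = 180     # core DNA plus H1-associated linker DNA (human, approximate)
    linker_histone: str = "H1"  # a single linker histone binds per nucleosome

    @property
    def core_histone_count(self):
        """Number of proteins in the histone core."""
        return sum(CORE_HISTONES.values())

    @property
    def linker_dna_bp(self):
        """DNA that is not wound around the octamer."""
        return self.total_dna_bp - self.core_dna_bp

nucleosome = Nucleosome()
print(nucleosome.core_histone_count)  # 8
print(nucleosome.linker_dna_bp)       # 33
```

The 33 bp difference is simply the quoted 180 bp minus the 147 bp of core DNA, so it inherits the approximate nature of the 180 bp figure.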
C. Winding The DNA Into Chromatin: Determining How Many Base
Pairs of DNA Are Associated With A Nucleosome
1. In order to figure out the amount of DNA wound around a
nucleosome, the enzyme micrococcal nuclease (MNase) was used
a. Micrococcal nuclease will digest free double stranded DNA
b. Micrococcal nuclease will not digest double stranded DNA
associated with proteins
2. To perform the MNase experiment, chromatin was isolated and
subjected to treatment with a low concentration of micrococcal
nuclease
3. Once treatment was finished, protein components were removed,
leaving only the DNA that was associated with the nucleosomes
4. The DNA was then run on an agarose gel
5. On the agarose gel, they saw DNA bands at multiples of 180
bp
6. Winding The DNA Into Chromatin: Determining How Many Base
Pairs of DNA Are Associated With A Nucleosome
7. As we saw before, the 180 bp refers to the amount of DNA
wound around the core nucleosome and the linker DNA associated with
Histone H1
8. In order to figure out how much DNA was wound around the core
nucleosome, the chromatin was exposed to more extensive micrococcal
nuclease treatment
a. The linker Histone (Histone H1) is not tightly bound to the
DNA
b. Extensive treatment with MNase results in the release of
Histone H1
c. The release of Histone H1 in turn allows the MNase to degrade
the linker DNA that Histone H1 had been bound to
9. After extensive treatment, only a 147 base pair band was
seen, indicating the amount of DNA wound around the core
nucleosome
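The banding pattern described above can be reproduced with an idealized model of a nucleosome array. The Python sketch below assumes that limited digestion cuts only within linker DNA and that every fragment is a whole number of 180 bp repeats; the function names and the example cut positions are hypothetical.

```python
def limited_digest(n_nucleosomes, cut_linkers, repeat_bp=180):
    """Fragment sizes when MNase cuts only the listed linkers.

    Linker i is the stretch of DNA between nucleosome i and nucleosome i + 1
    (numbered from 1), so valid linker numbers run from 1 to n_nucleosomes - 1.
    """
    cuts = sorted({c for c in cut_linkers if 1 <= c < n_nucleosomes})
    boundaries = [0] + cuts + [n_nucleosomes]
    return [(end - start) * repeat_bp for start, end in zip(boundaries, boundaries[1:])]

def extensive_digest(n_nucleosomes, core_bp=147):
    """Every linker is cut and trimmed back, leaving one core-sized fragment per nucleosome."""
    return [core_bp] * n_nucleosomes

print(limited_digest(6, cut_linkers=[2, 3]))  # [360, 180, 540]
print(extensive_digest(3))                    # [147, 147, 147]
```

Every fragment from the limited digest is a multiple of 180 bp, which is why the gel shows a ladder, while the extensive digest collapses everything to the 147 bp core size.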
D. Winding The DNA Into Chromatin: The Winding of the DNA Around
The Core Histones
1. Although the nucleosome is not perfectly symmetrical, it
does have a twofold axis of symmetry called the dyad axis
a. Imagine the nucleosome as a clock, with the midpoint of the
147 bp of DNA at the 12 o'clock position
b. This places the ends of the DNA at the 11 o'clock and 1 o'clock
positions
2. Throughout the nucleosome, there are 14 distinct sites of
contact between the core histones and the DNA
3. An interaction occurs each time the minor groove of the DNA
faces the histone octamer
4. Winding The DNA Into Chromatin: The Winding of the DNA Around
the Core Histones
5. The core histone proteins bind the DNA through H-bonds in a
sequence non-specific manner (minor groove binding)
a. The majority of H-bonds are between the histones and the
oxygens in the phosphodiester backbone
b. Few H-bonds are formed between the histones and the
nitrogenous bases
6. In order to get the DNA wound around a nucleosome, it must be
bent
7. In general DNA is a rigid molecule that resists bending,
because the negatively charged phosphates in the backbone repel
one another
8. The total number of Hydrogen bonds between the core histones
and the DNA is twice as many as a typical sequence specific DNA
binding protein
a. Core histones interact with the DNA through 40 Hydrogen
bonds
b. Typical Sequence specific DNA binding proteins interact with
the DNA through 20 hydrogen bonds
9. Two aspects of the core histone-DNA interaction result in
allowing the rigid DNA to bend
a. Increased hydrogen bonding
b. Basic nature of the histones masks the negative charge of the
phosphate groups
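The count of 14 contact sites is consistent with one contact per helical turn of the wound DNA, since the minor groove faces the octamer once per turn. A quick arithmetic check in Python, assuming a B-form helical repeat of about 10.5 bp per turn (a value not given in this outline); the constant names are illustrative.

```python
CORE_DNA_BP = 147         # DNA wound around the histone octamer
CONTACT_SITES = 14        # histone-DNA contact sites in the nucleosome
HELICAL_REPEAT_BP = 10.5  # assumed base pairs per helical turn of B-form DNA

bp_per_contact = CORE_DNA_BP / CONTACT_SITES
print(f"{bp_per_contact:.1f} bp between contacts")

# The spacing should be close to one helical turn if contacts form
# each time the minor groove faces the histone core.
print(abs(bp_per_contact - HELICAL_REPEAT_BP) < 0.5)
```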
IV. Packaging The Genome
A. Packaging The Genome: Introduction
1. Winding the DNA around nucleosomes only compacts the DNA
6-fold, well short of the 1,000- to 10,000-fold compaction that is
necessary
2. To further look at DNA compaction, chromatin was carefully
isolated from nuclei and viewed by electron microscopy using a low
salt extraction
3. In 1975, Chambon's lab published electron micrographs of the
eukaryotic genome that revealed the existence of uniformly sized
particles in a repeating pattern
4. The genome looked like "beads on a string" by electron
microscopy
a. The beads represent DNA wrapped around the histone core
octamer
b. The string represents the free DNA helix between
nucleosomes
c. Note: these methods of extraction are non-physiological, so it
is unknown whether this conformation exists in vivo; it likely
does not
B. Packaging The Genome: The 30 nm Fiber
1. The binding of Histone H1 not only increases how tight the
DNA is wound around the nucleosome, but also stabilizes higher
order chromatin structures
2. When chromatin is studied in the test tube at increasing salt
concentrations, the addition of Histone H1 results in the
nucleosomal DNA forming a 30 nm fiber
3. Chromatin in situ experiments using electron microscopy also
further revealed that the arrays of nucleosomes formed a more
compact fiber
a. Revealed the 30 nm fiber (30 nm in diameter)
b. It is difficult to maintain the 30 nm structure upon
purification, so the structure as it exists in vivo remains poorly
understood
4. Two possible models have been proposed for the structure of
the 30 nm fiber, based on high salt extractions
a. Solenoid model
b. Zig-zag model
5. The solenoid model involves six consecutive nucleosomes
arranged per turn of the higher order structure
6. This model was initially supported by both electron
microscopy and X-ray diffraction studies
a. The 30 nm fiber has a helical pitch of ~11 nm which is the
diameter of the nucleosome disc
b. Suggests that the 30 nm fiber is composed of nucleosome discs
stacked on edge in the form of a helix
c. Flat surfaces on either side of the core histone octamer face
each other
d. Linker DNA is buried on the inside of the helix but never
passes through the central axis, which means shorter linker DNA
favors this conformation
7. The zig-zag ribbon structure twists and supercoils, and
through X-ray crystallography has been seen in transcriptionally
active cells
a. The zig-zag model is based on the zig-zag pattern of
nucleosomes formed upon Histone H1 addition in vitro
b. Requires the linker DNA to pass through the central axis in
straight form
c. Longer linker DNA favors this conformation
8. Recent data suggest that the zig-zag model, rather than the
solenoid model once thought to be physiologically relevant, may be
the physiological one
a. The solenoid does not form at physiological salt
concentrations (150 mM), so it is probably not relevant in situ
b. The controversy is still unresolved because different
eukaryotic species have different characteristic lengths of linker
DNA: those with longer linker DNA may have their chromatin in the
zig-zag conformation, and those with shorter linker DNA may have
theirs in the solenoid conformation
C. Packaging The Genome: Interactions Between Core Nucleosomes
In Formation Of The 30 nm Fiber
1. Although Histone H1 addition is required in vitro to
stabilize the 30 nm fiber, other interactions are necessary for its
formation
2. The core nucleosomes must also be able to interact with each
other
3. The core nucleosomes interact with each other through the
presence of the amino-terminal tails of the core histones (note:
the amino-terminal tails have been shown to have no role in DNA
binding)
4. If core histones lacking the amino-terminal tails are used in
the in vitro chromatin assay, then the 30 nm fiber fails to
form
5. The 3-dimensional crystal structure of the nucleosome shows
the amino-terminal tails of Histones H2A, H3 and H4 each
interacting with adjacent nucleosome cores in the crystal
lattice
6. The core Histone H4 has a positively charged tail, which can
interact with a negatively charged portion of the histone fold
domain of Histone H2A
7. The negatively charged portion of histone H2A is conserved,
but plays no role in DNA binding
8. It is thought that this interaction is particularly important
for 30 nm fiber formation
D. Packaging The Genome: Formation of Loop Domains To Increase
Compaction
1. Building the 30 nm fiber increases the compaction of the DNA
by approximately 40 fold
2. This is still significantly less than the 1000-10,000 fold
compaction ratio we need
3. Additional folding of the 30 nm fibers is required to
compact the DNA even further
4. The exact nature of the folded loop structure remains
unclear
5. However, one popular model proposes that the 30 nm fiber
forms loops of 40-90 kb of DNA that are held together at their
bases by the nuclear scaffold
a. Nuclear scaffold is made of proteins
b. Nuclear scaffold gives the nucleus structure
c. The true nature of the scaffold is still unknown
6. Two types of proteins have been identified to be part of the
nuclear scaffold
a. Topoisomerase II (Topo II)
b. SMC proteins
7. The 30 nm fibers are tethered to the nuclear scaffold at
their base
8. The loops extend out and away from the base
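The compaction figures quoted in this outline can be combined to show how much folding is still needed at each packaging level. The Python sketch below assumes that both the 6-fold and the 40-fold figures are measured relative to naked DNA, which the outline does not state explicitly; the names are illustrative.

```python
NUCLEOSOME_FOLD = 6              # compaction from winding DNA into nucleosomes
FIBER_30NM_FOLD = 40             # compaction once the 30 nm fiber has formed (assumed total)
REQUIRED_FOLD = (1_000, 10_000)  # overall range needed to fit the nucleus

def remaining_fold(achieved, required):
    """Additional compaction still needed after a given packaging level."""
    return required / achieved

for level, fold in [("nucleosomes", NUCLEOSOME_FOLD), ("30 nm fiber", FIBER_30NM_FOLD)]:
    low = remaining_fold(fold, REQUIRED_FOLD[0])
    high = remaining_fold(fold, REQUIRED_FOLD[1])
    print(f"After {level}: {low:.0f}- to {high:.0f}-fold still needed")
```

Under that assumption, the loop domains and any further folding would have to supply roughly another 25- to 250-fold compaction beyond the 30 nm fiber.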
E. Packaging The Genome: Heterochromatin and Euchromatin
1. Not all of the DNA in the cell is equally packaged
a. Some DNA is more tightly packaged
b. Some DNA is more loosely packaged
2. Early studies of the chromosomes were performed using DNA
dyes; these divided chromosomal regions into two categories,
depending on how tightly the DNA was packaged
a. Euchromatin
b. Heterochromatin
3. Heterochromatin is more densely packaged and thus more
readily stained
4. Euchromatin has a more open structure and thus stains more
poorly with dyes
5. (Supplemental Figure) Packaging The Genome: Heterochromatin
and Euchromatin
6. The difference between heterochromatin and euchromatin
structure is how the nucleosomes are packaged into higher order
structures
a. 30 nm fiber
b. Loop domains
7. In heterochromatin the nucleosomes readily assemble into
highly organized higher-order chromatin structures
8. In euchromatin, the nucleosomes are found to be in much less
organized assemblies
9. Heterochromatin and euchromatin have significant roles in the
cell
10. The state of the chromatin in any one region of a
chromosome/genome greatly affects gene expression
11. The euchromatin is much less densely packed
a. Genes in euchromatic regions are accessible by the machinery
that drives gene expression
b. Genes in euchromatic regions can be expressed
12. The heterochromatin is more tightly packed
a. Genes in heterochromatic regions are not accessible by the
machinery that drives gene expression
b. Genes in heterochromatic regions are not expressed
13. In the human body, heterochromatic and euchromatic regions
are different for each different cell type
V. Regulation of Chromatin Structure
A. Regulation of Chromatin Structure: Introduction
1. Chromatin structure is not static within the nucleus of the
cell
2. Chromatin structure changes dynamically to allow for
short-term changes in gene expression
3. The dynamic nature of DNA binding to the histone core is
biologically relevant, as sequence specific DNA binding proteins
strongly prefer to bind histone-free DNA
4. As a result of intermittent, spontaneous unwrapping, a
protein can gain access to its DNA binding site with a probability
ranging from 1 in 50 to 1 in 100,000
B. Regulation of Chromatin Structure: Where Sequence Specific
DNA Binding Proteins Bind The DNA
1. This change in winding either allows or blocks binding of a
variety of sequence specific DNA binding proteins that promote gene
expression
a. Tighter winding blocks binding
b. Looser winding allows binding
2. There are preferred sites on the nucleosome free DNA where
sequence specific DNA binding proteins bind
3. The more central the binding site is within the wound DNA,
the less accessible it is when the DNA transiently unwinds
4. The closer the binding site is to the ends of the wound DNA,
the more accessible it is when the DNA transiently unwinds
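The positional rule above can be illustrated by ranking sites by their distance from the nearer end of the wound DNA. This Python sketch is only a qualitative ordering: it does not compute the actual access probabilities, and the function names and example site positions are hypothetical.

```python
CORE_DNA_BP = 147  # DNA wound around the histone core

def distance_from_nearest_end(site_position, core_bp=CORE_DNA_BP):
    """Base pairs between a site (1-based position) and the nearer DNA end."""
    if not 1 <= site_position <= core_bp:
        raise ValueError("site must lie within the nucleosomal DNA")
    return min(site_position - 1, core_bp - site_position)

def rank_by_accessibility(site_positions):
    """Order sites from most accessible (near an end) to least (near the midpoint)."""
    return sorted(site_positions, key=distance_from_nearest_end)

# Hypothetical binding-site positions along the 147 bp of wound DNA.
print(rank_by_accessibility([74, 10, 140, 40]))  # [140, 10, 40, 74]
```

Position 74 sits at the midpoint of the 147 bp and so ranks last, while the sites within a few base pairs of either end rank first.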
C. Regulation of Chromatin Structure: Histone Modification
Affects DNA Winding
1. Modification of the histones can have a great effect on how
tightly the DNA is wound around the histone core
2. Since the DNA is normally tightly wrapped around the histone
core, most modifications promote a loosening of the DNA from the
histone core
3. There are three major modifications of the tails of the core
histones that change the ability of the DNA to wind
a. Acetylation
b. Methylation
c. Phosphorylation
4. Methylation can either work to wind the DNA tighter or looser
depending on the location of the methylation site
5. Phosphorylation generally acts to wind the DNA tighter to
allow for condensation of the chromosomes during mitosis
D. Regulation of Chromatin Structure: Histone Acetylation
1. The role of acetylation in vivo is to promote gene expression
by allowing for unwinding the DNA wound around the nucleosome
2. Acetyl groups (C2H3O) promote unwinding of the DNA around
the nucleosome in two ways
a. Acetyl groups are bulky and can displace DNA
b. Acetylation neutralizes the positive charge of the targeted
lysine, which reduces the positive charge on the histones and
allows the DNA to become unwound
3. Acetyl groups are placed on the core histones by an enzyme
known as acetyl-transferase
4. Generally lysines within the amino-terminal tails of the core
histone are targeted for acetylation by acetyl-transferase
5. As an example, acetylation of lysines 8 and 16 of Histone H4
results in an unwinding of DNA
6. This unwinding of DNA results in increased gene expression
due to binding of proteins that promote active gene expression
7. In reverse, deacetylation results in tighter winding of the
DNA and an inhibition of gene expression
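The charge effect of acetylation can be illustrated by counting positive side chains in the Histone H4 tail before and after acetylating lysines 8 and 16. The Python sketch below assumes the commonly cited first 20 residues of the H4 amino-terminal tail (the sequence is not given in this outline) and treats an acetylated lysine as neutral; histidine and the terminal amine are ignored for simplicity, and the function name is hypothetical.

```python
# Assumed sequence of the first 20 residues of the histone H4 amino-terminal tail
# (not given in the outline); lysines fall at positions 5, 8, 12, 16 and 20.
H4_TAIL = "SGRGKGGKGLGKGGAKRHRK"

def tail_positive_charges(sequence, acetylated_positions=()):
    """Count positively charged lysine (K) and arginine (R) side chains.

    Positions are 1-based. An acetylated lysine is treated as neutral.
    """
    charges = 0
    for position, residue in enumerate(sequence, start=1):
        if residue == "R":
            charges += 1
        elif residue == "K" and position not in acetylated_positions:
            charges += 1
    return charges

print(tail_positive_charges(H4_TAIL))                                # 8
print(tail_positive_charges(H4_TAIL, acetylated_positions={8, 16}))  # 6
```

In this simplified count, acetylating the two lysines removes two of the eight positive charges from the tail, which is the direction of change that weakens its attraction to the negatively charged DNA backbone.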
E. Regulation of Chromatin Structure: Histone Acetylation and
Colon Cancer
1. About 10 years ago, several papers were published suggesting
that both histone hyperacetylation and hypoacetylation are
important in the neoplastic process, depending on the target gene
involved
2. The effect of histone acetylation on the expression of p21 is
critically important in the formation of colon
cancer
3. Colon cancer is a cancer that forms in the lower part of your
digestive system, the large intestine
4. Most colon cancers start as benign clumps of cells (tumors)
called polyps
5. In time some polyps become malignant and form cancers
6. In general most colon cancer patients are over 50 years of
age
7. There is a higher incidence of colon cancer among African
Americans as compared to the general population
8. The initial symptoms of colon cancer generally only affect
the lower digestive tract
a. A change in bowel habits lasting more than several
weeks
b. Rectal bleeding or blood in the stool
c. A feeling the bowel does not properly empty
d. Weakness and fatigue
e. Unexplained weight loss
9. On a molecular scale, high levels of DNA damage have been
seen in colon carcinomas
10. Increased intake of dietary fiber has long been linked to a
decreased chance of developing colon cancer
11. It is thought that dietary fiber can induce the expression
of the p21 gene
12. Expression of p21 results in promoting cell cycle arrest
which can allow the DNA repair machinery to repair the DNA
13. In the intestine, dietary fiber is broken down by resident
bacteria to carbon dioxide, water, methane and hydrogen gases,
along with the following SCFAs (short chain fatty acids)
a. Acetate
b. Propionate
c. Butyrate
14. Butyrate has been shown to inhibit histone deacetylases
(HDACs) in the p21 gene region
15. This leaves core histones in this region hyperacetylated
a. The DNA remains in an unwound or loose conformation
b. The p21 gene is highly expressed
c. The process of colon carcinogenesis is in part inhibited