DUF1220 Domains & the Search for the Genes that Made Us Human
James M. Sikela, Ph.D.
Human Medical Genetics, Neuroscience, & Comparative Genomics Programs, Department of Biochemistry & Molecular Genetics, University of Colorado School of Medicine
Genomics Course, February 28, 2012
Key Points
• First gene-based and first genome-wide study of lineage-specific gene duplication and loss in human and primate evolution
• Dramatic human-specific increase in copy number of DUF1220 protein domains
• DUF1220 copy number linked to evolution of brain size
• Selection of evolutionarily adaptive genome sequences may be driving disease, e.g. 1q21.1
Primate Evolution
[Figure: primate phylogeny with approximate divergence times in millions of years ago (MYA)]
Lineages shown: New World Monkeys (e.g. squirrel monkey, spider monkey), Old World Monkeys (e.g. baboon, rhesus, etc.), Gibbons, Orangutan, Gorilla, Human, Chimp, Bonobo
• Bonobo/Chimp (B/C) = ~2 MYA
• Chimp/Human (C/H) = ~5 MYA
• Human-Chimp/Gorilla (HC/G) = ~8 MYA
• Human-Chimp-Gorilla/Orangutan (HCG/O) = ~13 MYA
• Human-Chimp-Gorilla-Orangutan/Gibbon (HCG/O/Gib) = ~20 MYA
• Hominoids/Old World Monkeys (Hom/OWM) = ~25 MYA
• Hominoids and Old World Monkeys/New World Monkeys (HomOWM/NW) = ~40 MYA
More Primates!
[Images: Chimpanzee, Gorilla, Bonobo, Orangutan]
---- something has changed!
Human Characteristics
• Body shape and thorax
• Cranial properties (brain case and face)
• Small canine teeth
• Skull balanced upright on vertebral column
• Reduced hair cover
• Enhanced sweating
• Dimensions of the pelvis
• Elongated thumb and shortened fingers
• Relative limb length
• Neocortex expansion
• Enhanced language & cognition
• Advanced tool making
modified from S. Carroll, Nature, 2005
Reports of “human-specific” genes
• FOXP2
– Mutated in family with language disability
• ASPM/MCPH
– Mutated in individuals with microcephaly
• HAR1F
– Gene sequence highly changed in humans
• DUF1220 protein domains
– Highly increased in copy number in humans; expressed in important brain regions
Molecular Mechanisms Underlying Genome Evolution
• Single nucleotide substitutions
- change gene expression & structure
• Genome rearrangements
• Gene duplication
- copy number change: gene dosage
- redundancy as a facilitator of innovation
Gene Duplication & Evolutionary Change
• “There is now ample evidence that gene duplication is the most important mechanism for generating new genes and new biochemical processes that have facilitated the evolution of complex organisms from primitive ones.”
- W. H. Li in Molecular Evolution, 1997
• “Exceptional duplicated regions underlie exceptional biology”
- Evan Eichler, Genome Research 11:653-656, 2001
Fig 1. Measuring genomic DNA copy number alteration using cDNA microarrays (array CGH). Fluorescence ratios are depicted in a pseudocolor scale, such that red indicates increased, and green decreased, gene copy number in the test (right) compared to reference sample (left).
Interhominoid cDNA Array-Based Comparative Genomic Hybridization (aCGH)
Experimental Design
• Carry out pairwise cDNA aCGH comparisons between human and other hominoid species
• Use a >39,000 cDNA microarray representing >29,000 human genes
• Hybridize human genomic DNA (reference sequence: cy3/green) and other hominoid genomic DNAs (test sequence: cy5/red) simultaneously to the microarray
• Visualize aCGH signals “gene-by-gene” along each chromosome across five species: human (n=5), bonobo (n=3), chimpanzee (n=4), gorilla (n=3) and orangutan (n=3)
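The test/reference comparison in the design above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names and the signal intensities are hypothetical, and the cutoffs of 2 and 0.5 are assumptions borrowed from the ends of the color scale on the whole-genome figure, used here as call thresholds purely for illustration.

```python
import math


def log2_ratio(test_signal, reference_signal):
    """Return log2(test/reference) for one cDNA spot.

    test_signal: cy5/red intensity (non-human hominoid genomic DNA)
    reference_signal: cy3/green intensity (human genomic DNA)
    """
    return math.log2(test_signal / reference_signal)


def classify(test_signal, reference_signal, gain=2.0, loss=0.5):
    """Label a spot by its test/reference ratio.

    The gain/loss cutoffs are illustrative assumptions, not values
    stated as call thresholds in the slides.
    """
    ratio = test_signal / reference_signal
    if ratio >= gain:
        return "gain in test species"
    if ratio <= loss:
        return "loss in test species"
    return "no change"


# Hypothetical intensities, for illustration only.
spots = {
    "gene_A": (2400.0, 1000.0),
    "gene_B": (400.0, 1000.0),
    "gene_C": (1100.0, 1000.0),
}

for name, (test, ref) in spots.items():
    print(name, round(log2_ratio(test, ref), 2), classify(test, ref))
```

Because human DNA is the reference channel, a "loss in test species" call can equally reflect a copy number gain on the human lineage; telling the two apart needs the comparison across all five species.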
Whole Genome Caryoscope Image of Interhominoid aCGH Data
Human & Great Ape Genes Showing Lineage-Specific Copy Number Gain/Loss
Fortna, et al, PLoS Biol. 2004
Summary of Human/Primate ArrayCGH Results
• First genome-wide and first gene-based aCGH comparison of human and nonhuman primate gene copy number variation (Fortna, et al 2004)
• 1,004 (4,159) genes identified that showed lineage-specific changes in copy number
• Time machine of evolutionary copy number change
• Gene candidates to underlie lineage-specific traits
• Genes identified represent most of the major lineage-specific gene duplications and losses over the last 60 million years of human and primate evolution (Dumas, et al 2007)
[Figure: whole-genome Caryoscope view of interhominoid aCGH data. Chromosomes 1-22, X and Y are drawn with positions in Mb and cytoband labels. Color scale: Test/Reference ratio, from ≤0.5 through 1 to ≥2. Species key: Human (Homo sapiens), Bonobo (Pan paniscus), Chimpanzee (Pan troglodytes), Gorilla (Gorilla gorilla), Orangutan (Pongo pygmaeus). Numbered regions 1-23 are marked along the chromosomes; enlarged panels labeled 3, 6, 9 and 13 each show tracks H, B, C, G, O.]
“This (Fortna, et al, 2004) is the first time that copy number changes among apes have been assayed for the vast majority of human genes, and we can expect that the biological consequences of the 140 human-specific copy number changes identified in this study will be heavily investigated over the coming years.”
---M. Hurles, PLoS Biol. 2004
DUF1220 Repeat Unit
Popesco, et al, Science 2006
InterPro-predicted DUF1220-containing proteins (NBPF family*)
*Vandepoele, et al, Mol. Biol. & Evol, 2005
Copy Number of DUF1220 (Q8IX62/17-33) Sequences in Primate Species
[Figure: bar chart. y-axis: Q-PCR Predicted Copy Number (0 to 70); x-axis: Human, Bonobo, Chimp, Gorilla, Orangutan, Gibbon, Macaque, Baboon]
Summary of aCGH, Q-PCR and BLAT results:
• DUF1220 domains are highly amplified in human, reduced in great apes, further reduced in Old & New World monkeys, single or low copy in non-primate mammals, and absent in non-mammals
DUF1220 copy number in Animal Genomes

Genome        PDE4DIP DUF1220   Total DUF1220   NBPF genes
Euarchontoglires
Human         2                 268             21
Chimp         3                 125             15
Gorilla       3                 99              15
Orangutan     4                 92              11
Macaque       1                 35              10
Marmoset      1                 30              10
Rabbit        1                 8               3
Mouse         1                 1               0
Rat           1                 1               0
Guinea Pig    1                 1               0
Laurasiatheria
Cow           1                 6               2
Pig           1                 3               1
Horse         1                 8               3
Dog           1                 3               1
Panda         1                 2               1
Afrotheria
Elephant      1                 1               1
Metatheria
Opossum       1                 1               0
Prototheria
Platypus      1                 1               0
Other Vertebrates
Chicken       0                 0               0
Lizard        0                 0               0
Frog          0                 0               0
Zebrafish     0                 0               0

A total of 40 genomes were searched, but only the 22 with 4X coverage or higher are displayed.
DUF1220 Copy Number Statistics in hg19 build

Statistic                                                     DUF1220 Copies
Total in Human Genome                                         272
Total amplified HLS DUF1220 Triplets                          129
Total DUF1220 in Last Common Ancestor of Homo/Pan             102
Total of Newly Added Copies in Human Lineage                  167
Total Copies Added via Domain Amplification                   146
Total Copies Added via Gene Duplication                       21
Average Number Added to Human Lineage every million years     28

This table shows the unprecedented DUF1220 copy number increase in the human lineage. The primary mechanism for this expansion was domain amplification via hyper-amplification of the HLS (human lineage-specific) DUF1220 triplet.
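The arithmetic behind the last rows of the table can be checked with a short Python sketch. The copy counts are taken from the table; the time since the Homo/Pan split is not stated there, so the value of about 6 million years used below is an assumption, chosen because it reproduces the reported average. The variable names are illustrative only.

```python
# Counts copied from the hg19 statistics table.
added_via_domain_amplification = 146
added_via_gene_duplication = 21
reported_rate_per_million_years = 28

# ASSUMPTION: about 6 million years since the Homo/Pan split.
# This value is not given in the table.
assumed_million_years_since_split = 6.0

# Newly added copies are the sum of the two mechanisms.
newly_added = added_via_domain_amplification + added_via_gene_duplication

# Average copies added per million years under the assumed split time.
rate = newly_added / assumed_million_years_since_split

# Split time implied by the reported rate, for comparison.
implied_million_years = newly_added / reported_rate_per_million_years

print(f"Newly added copies: {newly_added}")
print(f"Copies added per million years: {rate:.1f}")
print(f"Split time implied by reported rate: {implied_million_years:.1f} MY")
```

The result is sensitive to the assumed split time: with the ~5 MYA chimp/human divergence shown on the primate phylogeny slide, the same 167 copies would give roughly 33 per million years.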
Sequences encoding DUF1220 domains
• Show a major copy number burst in primates
• Are increasingly amplified generally as a function of a species' evolutionary proximity to humans, where the greatest number of copies (270) is found
• Show signs of positive selection
• Are highly expressed in brain regions associated with higher cognitive function
• In brain show neuron-specific expression preferentially in cell bodies and dendrites
Popesco, et al, Science 2006
• Recurrent Reciprocal 1q21.1 Deletions and Duplications Associated with Microcephaly or Macrocephaly and Developmental and Behavioral Abnormalities
Brunetti-Pierri, et al, Nature Genetics 2008
• Recurrent Rearrangements of Chromosome 1q21.1 and Variable Pediatric Phenotypes
Mefford, et al, N. Engl. J. Med. 2008
1q21.1 Deletions* Linked to Microcephaly
1q21.1 Duplications* Linked to Macrocephaly
We note that these CNVs encompass or are immediately flanked by DUF1220 sequences (Dumas & Sikela, Cold Spring Harbor Symposium Quant. Biol., 2009)
*Implies that human brain size is directly related to the dosage of one or more genes in these 1q21.1 CNVs
DUF1220/NBPF Sequences & Recurrent Disease-associated 1q21.1 CNVs
Association (p<0.0001) of human head circumference (FOC Z-score) & DUF1220 copy number
[Figure: Head Circumference (FOC Z-Score) vs. DUF1220 Copy Number. Scatter plot; x-axis: Q-PCR-Predicted DUF1220 Copy Number (20 to 80); y-axis: FOC Z-Score (-6 to 6); series: Class I Deletion, Class II Deletion, Duplication]
Copy number of genes in the 1q21.1-q21.2 region versus brain size
• 46 1q21.1 genes compared along with brain size across 5 primate species
• DUF1220 shows the most dramatic human-specific copy number increase.
• The evolutionary increase in DUF1220 copy number parallels the increase in brain size.

                  Human   Chimp   Orangutan   Macaque   Marmoset
Brain Size (g)    1350    380     390         88        7
Copy #
DUF1220           272     125     92          35        30
PPIAL4            5       1       1           0         0
LOC728855         5       2       2           2         1
FAM72D            2       0       0           0         0
SRGAP             1       0       0           0         0
PDE4DIP           3       3       4           1         1
SEC22B            1       1       1           1         1
NOTCH2NL          1       1       1           1         1
HFE2              1       1       1           1         1
TXNIP             1       1       1           1         1
POLR3             2       2       2           2         2
ANKRD34           1       1       1           1         1
ANKRD35           1       1       1           1         1
LIX1L             1       1       1           1         1
RBM8A             1       1       1           1         1
GNRHR2            1       1       1           1         1
PEX11B            1       1       1           1         1
ITGA10            1       1       1           1         1
NUDT17            1       1       1           1         1
RNF115            1       1       1           1         1
CD160             1       1       1           1         1
PDZK1             3       1       1           1         1
GPR89             3       1       1           1         1
PRKAB2            1       1       1           1         1
PDIA3P            1       1       1           1         1
FMO5              1       1       1           1         1
CHD1L             1       1       1           1         1
BCL9              1       1       1           1         1
ACP6              1       1       1           1         1
GJA5              1       1       1           1         1
GJA8              1       1       1           1         1
LOC645166         1       0       0           0         0
FCGR1             2       1       1           1         1
SV2A              1       1       1           1         1
BOLA1             1       1       1           1         1
MTMR11            1       1       1           1         1
OTUD7B            1       1       1           1         0
SF3B4             1       1       1           1         1
VPS45             1       1       1           1         1
PLEKHO1           1       1       1           1         1
ANP32E            1       1       1           1         1
PRPF3             1       1       1           1         1
C1orf54           1       1       1           1         1
MRPS21            1       1       1           1         1
CA14              1       1       1           1         1
C1orf51           1       1       1           1         1
APH1A             1       1       1           1         1
DUF1220 Copy Number Versus Brain Size
* Neandertal DUF1220 copy number is an estimate based on sequence read depth from the Neandertal genome (Green et al 2010).
-but correlation is not causation
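As a rough illustration of the parallel between DUF1220 copy number and brain size, the Python sketch below computes a Pearson correlation from the five-species values in the 1q21.1 table (brain size in grams and DUF1220 copy number). This is not the analysis used in the talk. With only five species and no correction for shared ancestry it is a descriptive number, not a statistical test, and the function name `pearson_r` is illustrative.

```python
import math

# Values copied from the 1q21.1 gene table (five primate species).
species = ["Human", "Chimp", "Orangutan", "Macaque", "Marmoset"]
brain_size_g = [1350, 380, 390, 88, 7]
duf1220_copies = [272, 125, 92, 35, 30]


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(duf1220_copies, brain_size_g)
print(f"Pearson r (DUF1220 copies vs. brain size): {r:.3f}")
```

A high coefficient here only restates the slide's observation; as the slide says, correlation is not causation.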
Factors that must be reconciled with model linking 1q21.1 instability, evolutionary adaptation & recurrent disease
• Evolutionarily rapid DUF1220 copy number increase
– Estimate, on average, 28 more DUF1220 domains added to human genome every 1 million years since Homo/Pan split
• Underlying mechanism must account for continued, recurrent DUF1220 increases
• Underlying mechanism must account for excess of 1q21.1 disease-associated CNVs containing dosage-sensitive genes
Proposed Mechanism Linking DUF1220, Brain Evolution and Disease
[Diagram elements]
• Increased 1q21.1 Instability
• Increase in DUF1220 Copy Number
• Evolutionary Advantage (Increase in Brain Size?)
• 1q21.1 duplications: Macrocephaly; Autism*
• 1q21.1 deletions: Microcephaly; Schizophrenia*
*Diseases proposed as “Diametric Opposites” (including brain size), Crespi, Stead & Elliot, PNAS, 2009
DUF1220 Model*
DUF1220 model proposes that:
1) DUF1220 copy number is directly involved in influencing human brain size, and
2) the evolutionary advantage of rapidly increasing DUF1220 copy number in the human lineage has resulted in favoring retention of the high genomic instability of the 1q21.1 region which, in turn, has precipitated a spectrum of recurrent human brain and developmental disorders
*Dumas & Sikela, Cold Spring Harbor Symposium Quant. Biol., 2009
Concluding Thoughts
• DUF1220 domains show the largest HLS protein coding copy number increase in the genome
– But no one gene made us human
– DUF1220 genotyping challenges
• We know more about our genome than ever
– But there are vast areas of our genome about which we know virtually nothing
– No mammalian genome has been completely sequenced
Acknowledgements
• Sikela Lab
– Laura Dumas
– Majesta O’Bleness
– Maggie Popesco
– Erik MacLaren
– Andy Fortna
– Jan Hopkins
– Jonathon Keeney
– Jack Davis
– Jay Jackson
– Megan Sikela
– Michael Cox
– Kriste Marshall
– Matt Brenton
– Sonya Burgers
– Raquel Hink
– Erin Dorning
– Park McNair
• Collaborators
– Stanford: Jon Pollack, Young Kim
– Univ. of Kansas: Gerald Wyckoff
– Univ of Utah: Lynn Jorde
– Baylor College: Pawel Stankiewicz, Sau Wai Cheng
– UCSOM, Epidemiology: Tasha Fingerlin
– UCSOM, Preventive Medicine & Biometrics: Anis Karimpour-Fard
– UCSOM, Neuroscience Program: Rock Levinson, John Caldwell
A Walk Through Our Genome
--All regions of the genome are not created equal