Ch2. Genome Organization and Evolution (continued)
阮雪芬
Jan 02, 2003
NTUST
Pick out Genes in Genomes
• Open reading frames (ORFs)
  – Start codon ------------------ stop codon
  – A potential protein-coding region
• Approaches to identify protein-coding regions
  – Detection of regions similar to known coding regions from other organisms
  – Ab initio methods
• Gene identification is more complete and accurate for bacteria than for eukaryotes
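The ORF definition above (a start codon followed, in the same reading frame, by a stop codon) can be sketched in a few lines of Python. This is a minimal illustration rather than a real gene finder: it scans the forward strand only, and the function name `find_orfs`, the `min_codons` parameter, and the example sequence are assumptions made for this sketch.

```python
# Minimal ORF scan on the forward strand only (an illustrative simplification).
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ORFs.

    `end` is exclusive and includes the stop codon. `min_codons` is the
    minimum number of codons before the stop, counting the start codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it and keep scanning
    return sorted(orfs)


example = "CCATGAAATTTGGGTAACC"                # hypothetical toy sequence
for s, e in find_orfs(example):
    print(s, e, example[s:e])
```

A fuller scan would also examine the reverse complement, giving six reading frames in total. In eukaryotes, introns interrupt the coding sequence, which is one reason a scan like this works better for bacteria.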
Pick out Genes in Genomes
• A framework for ab initio gene identification in eukaryotic genomes
Genomes of Prokaryotes
• Most prokaryotic cells contain
  – A large single circular piece of double-stranded DNA (< 5 Mb)
  – Plasmids
• In E. coli, only ~11% of the DNA is non-coding.
The Genome of the Bacterium E. coli
• Strain K-12 contains 4639221 bp in a single circular DNA molecule, with no plasmids.
• An inventory reveals
  – 4285 protein-coding genes
  – 122 structural RNA genes
  – Non-coding repeat sequences
  – Regulatory elements
  – Transcription/translation guides
  – Transposases
  – Prophage remnants
  – Insertion sequence elements
  – Patches of unusual composition
Escherichia coli (大腸桿菌)
The Genome of the Bacterium E. coli
• The average size of an ORF is 317 amino acids.
• There are 630-700 operons. Operons vary in size, although few contain more than five genes. Genes within an operon tend to have related functions.
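The figures quoted for E. coli K-12 can be cross-checked with a little arithmetic. The sketch below multiplies the number of protein-coding genes by the average ORF length to estimate the protein-coding fraction of the genome. It is a rough estimate under stated assumptions: stop codons and the 122 structural RNA genes are ignored, and the variable names are chosen for this example.

```python
genome_bp = 4_639_221      # K-12 genome length, from the slide
protein_genes = 4285       # protein-coding genes, from the slide
mean_orf_aa = 317          # average ORF length in amino acids, from the slide

# Three nucleotides per codon; stop codons are ignored in this estimate.
coding_bp = protein_genes * mean_orf_aa * 3
coding_fraction = coding_bp / genome_bp

print(coding_bp)                          # estimated protein-coding base pairs
print(round(coding_fraction * 100, 1))    # estimated protein-coding percentage
```

The estimate comes out a little under 90% protein-coding, which is broadly consistent with the statement that only about 11% of E. coli DNA is non-coding.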
The Genome of the Bacterium E. coli
• Several features of E. coli
  – It can synthesize all components of proteins and nucleic acids, and cofactors.
  – It has metabolic flexibility.
  – It has a wide range of transporters.
  – Even for specific metabolic reactions there are many cases of multiple enzymes.
  – It does not possess a complete range of enzymatic capacity.
The genome of the archaeon Methanococcus jannaschii
• Methanococcus jannaschii was collected from a hydrothermal vent 2600 m deep off the coast of Baja California, Mexico, in 1983.
• Thermophilic organism
• The genome was sequenced in 1996 by The Institute for Genomic Research (TIGR). It was the first archaeal genome sequenced.
Methanococcus jannaschii (古甲烷球菌)
The genome of the archaeon Methanococcus jannaschii
• It contains a large chromosome consisting of a circular double-stranded DNA molecule 1664976 bp long.
• 1743 predicted coding regions.
• Some RNA genes contain introns.
• As in other prokaryotic genomes, there is little non-coding DNA.
• In archaea, proteins involved in transcription, translation, and regulation are more similar to those of eukaryotes.
• Archaeal proteins involved in metabolism are more similar to those of bacteria.
The genome of one of the simplest organisms: Mycoplasma genitalium
• An infectious bacterium.
• Its genome was sequenced in 1995 by TIGR, The Johns Hopkins University and The University of North Carolina.
• The gene repertoire includes some that encode proteins for
  – DNA replication
  – Transcription
  – Translation
  – Adhesins
  – Other molecules for defence against the host's immune system
  – Transport proteins
Mycoplasma (黴漿菌)
Genomes of Eukaryotes
• In eukaryotic cells, the majority of DNA is in the nucleus, separated into bundles of nucleoprotein, the chromosomes.
• Each chromosome contains a single double-stranded DNA molecule.
• Nuclear genomes of different species vary widely in size.
• Eukaryotic species vary in the number of chromosomes and in the distribution of genes among them.
  – Human chromosome 2 corresponds to a fusion of chimpanzee chromosomes 12 and 13.
Genomes of Eukaryotes
• Saccharomyces cerevisiae (baker's yeast)
  – Protein-protein interaction
    • Yeast two-hybrid system
Yeast Two-hybrid System
• Useful in the study of various interactions
• The technology was originally developed during the late 1980s in the laboratory of Dr. Stanley Fields (see Fields and Song, 1989, Nature).
Yeast Two-hybrid System
(Figure labels: GAL4 DNA-binding domain; GAL4 DNA-activation domain. Source: Nature, 2000)
Yeast Two-hybrid System
• Library-based yeast two-hybrid screening method
(Figure source: Nature, 2000)
Protein-protein Interactions on the Web
• Yeast
  – http://depts.washington.edu/sfields/yplm/data/index.html
  – http://portal.curagen.com
  – http://mips.gsf.de/proj/yeast/CYGD/interaction/
  – http://www.pnas.org/cgi/content/full/97/3/1143/DC1
  – http://dip.doe-mbi.ucla.edu/
  – http://genome.c.kanazawa-u.ac.jp/Y2H
• C. elegans
  – http://cancerbiology.dfci.harvard.edu/cancerbiology/ResLabs/Vidal/
• H. pylori
  – http://pim/hybrigenics.com
• Drosophila
  – http://gifts.univ-mrs.fr/FlyNets/Flynets_home_page.html
Yeast Protein Linkage Map Data
• New protein-protein interactions in yeast
  – Stanley Fields Lab: http://depts.washington.edu/sfields/yplm/data
  – List of interactions with links to YPD
Genomes of Eukaryotes
• Caenorhabditis elegans
  – The genome sequence was completed in 1998.
  – The first full DNA sequence of a multicellular organism.
  – XX genotype: a self-fertilizing hermaphrodite.
  – XO genotype: a male.
Genomes of Eukaryotes
• Drosophila melanogaster
  – Its genome sequence was announced in 1999 by a collaboration between Celera Genomics and the Berkeley Drosophila Genome Project.
  – Although insects are not very closely related to mammals, the fly genome is useful in the study of human disease.
  – It contains homologues of 289 human genes implicated in various diseases:
    • Cancer
    • Cardiovascular disease, etc.
Genomes of Eukaryotes-Human
• In Feb 2001, the International Human Genome Sequencing Consortium and Celera Genomics published, separately, drafts of the human genome.
• 22 chromosome pairs plus X and Y
• Protein-coding genes
  – ~32000 genes in all
Genomes of Eukaryotes-Human
• Human protein-coding genes
  – Nucleic acid binding
  – Transcription factor binding
  – Cell cycle regulator
  – Chaperone
  – Motor
  – Actin binding
  – Defense/immunity protein
  – Enzyme
  – Enzyme activator
  – Enzyme inhibitor
  – Apoptosis
  – Signal transduction
  – Storage protein
  – Cell adhesion
  – Structural protein
  – Transporter
  – Ligand binding or carrier
  – Tumour suppressor
  – Unclassified
Genomes of Eukaryotes-Human
• Repeat sequences
  – 50% of the genome
  – Contain
    • Transposable elements
    • Retroposed pseudogenes
    • Simple "stutters"
    • Segmental duplications
    • Blocks of tandem repeats
Genomes of Eukaryotes-Human
• RNA
  – 497 transfer RNA genes
  – Genes for 28S and 5.8S ribosomal RNAs
  – Small nucleolar RNAs
  – Spliceosomal snRNAs
SNPs
• Single-nucleotide polymorphisms (SNPs)
  – A genetic variation between individuals, limited to a single base pair, which can be substituted, inserted or deleted.
  – Sickle-cell anaemia is an example of a disease caused by a specific SNP.
    • An A→T mutation in the beta-globin gene changes a Glu→Val.
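The sickle-cell example can be made concrete at the codon level. The sketch below applies a single-base substitution to a codon and looks up the resulting amino acid. The specific codons (GAG for Glu, GTG for Val) are the well-known sickle-cell change and are an addition here, not something stated on the slide. The helper name `apply_snp` and the two-entry codon table are assumptions made for this illustration.

```python
# Partial codon table: only the two codons needed for this illustration.
CODON_TO_AA = {"GAG": "Glu", "GTG": "Val"}


def apply_snp(codon, position, new_base):
    """Return the codon with the base at the 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


normal = "GAG"                       # glutamate codon in normal beta-globin
mutant = apply_snp(normal, 1, "T")   # A -> T at the middle position

print(normal, CODON_TO_AA[normal], "->", mutant, CODON_TO_AA[mutant])
```

A single substituted base therefore changes one amino acid in the protein, which is what makes this SNP non-synonymous.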
SNPs
• Single-nucleotide polymorphisms (SNPs)
  – Nearly 1.8 million SNPs are known.
  – They occur on average every 2000 base pairs.
  – Not all SNPs are linked to disease.
  – The A, B, and O alleles of the genes for blood groups illustrate these possibilities.
    • A and B alleles differ by four SNP substitutions.
ABO Blood Groups
The human ABO blood groups illustrate the effect of glycosyltransferases.
(Figure labels: N-acetylgalactosamine, Galactose)
Evolution of Genomes
• Synonymous nucleotide substitution
• Non-synonymous nucleotide substitution
  – Ka = the number of non-synonymous nucleotide substitutions
  – Ks = the number of synonymous nucleotide substitutions
  – Ka/Ks: a high ratio indicates possibly functional changes
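The distinction between synonymous and non-synonymous substitutions can be illustrated by comparing two aligned coding sequences codon by codon. The sketch below is a simplified count under stated assumptions: it uses the standard genetic code, skips codons that differ at more than one position, and reports raw counts only. Published Ka/Ks methods also normalize by the number of synonymous and non-synonymous sites, which this example does not attempt. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(seq_a, seq_b):
    """Count (synonymous, non-synonymous) single-base codon differences.

    Both sequences must be aligned, in frame, and of equal length.
    Codons that differ at two or three positions are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        if sum(x != y for x, y in zip(a, b)) != 1:
            continue                          # identical or multi-hit codon
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1                          # same amino acid
        else:
            nonsyn += 1                       # amino acid changed
    return syn, nonsyn


# Hypothetical toy alignment: GAG->GTG changes Glu to Val, CTT->CTC keeps Leu.
print(count_substitutions("ATGGAGCTTAAA", "ATGGTGCTCAAA"))
```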
Example - The Effect of RGD Mimetic Peptide in Breast Cancer Cell Line MCF7
Introduction
• RGD has been used as an inhibitor of integrin-ligand interaction.
• Loss of integrin-mediated signaling induces apoptosis.
(Figure panels: Control, Aggregation, Cell Death)
RGD (Arg-Gly-Asp) is the smallest motif that binds the integrin receptor on the cell surface, and it plays an important role in the cell cycle.
The Structures of RGD Mimetic Peptides
(Figure: chemical structures of the RGD peptide (Arg-Gly-Asp) and Cyclic-RGD. Residue labels shown: Arg, Gly, Asp, Trp, Pro, Cys, Tpa.)
Apoptosis
• 34 genes in total; only 19 remain after filtering.
• 11 genes show an expression fold change > 2 (up or down).
Apoptosis Regulator

| Description | GenBank accession no. | 6 h fold change | 24 h fold change | 48 h fold change | 72 h fold change |
|---|---|---|---|---|---|
| Group 1 | | | | | |
| caspase 10, apoptosis-related cysteine protease | U60519 | - | - | - | 0.471 |
| CASP8 and FADD-like apoptosis regulator | U97075 | - | - | - | 0.355 |
| nucleoside diphosphate kinase type 6 (inhibitor of p53-induced apoptosis-alpha) | AF051941 | - | - | - | 0.376 |
| Group 2 | | | | | |
| caspase 3, apoptosis-related cysteine protease | U13738 | - | 2.301 | - | - |
| CASP8 and FADD-like apoptosis regulator | AF005775 | - | 2.272 | - | - |
| Group 3 | | | | | |
| caspase 9, apoptosis-related cysteine protease | U60521 | - | - | 2.519 | - |
| Group 4 | | | | | |
| caspase 4, apoptosis-related cysteine protease | Z48810 | 2.615 | - | 2.796 | 2.819 |
| Group 5 | | | | | |
| inhibitor of apoptosis protein | AAF19819 | - | - | - | 5.249 |
| caspase 7, apoptosis-related cysteine protease | U67319 | - | - | - | 2.19 |
| caspase 4, apoptosis-related cysteine protease | U28976 | - | - | - | 2.603 |
| Group 6 | | | | | |
| CASP8 and FADD-like apoptosis regulator | AF015450 | - | - | - | 6.912 |
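The "fold change > 2, up or down" criterion can be sketched as a simple threshold test on the 72 h values listed above. Two assumptions are made here that the slides do not state: a fold change below 1 is read as down-regulation, and the down cut-off is taken as the reciprocal of the up cut-off (0.5 for a two-fold change). The function name `classify` is hypothetical.

```python
# 72 h fold changes copied from the table above (accession -> fold change).
fold_72h = {
    "U60519": 0.471, "U97075": 0.355, "AF051941": 0.376,
    "Z48810": 2.819, "AAF19819": 5.249, "U67319": 2.19,
    "U28976": 2.603, "AF015450": 6.912,
}


def classify(fold, threshold=2.0):
    """Label a fold change as 'up', 'down', or 'unchanged'.

    Assumes a symmetric cut-off: up if fold >= threshold,
    down if fold <= 1 / threshold.
    """
    if fold >= threshold:
        return "up"
    if fold <= 1.0 / threshold:
        return "down"
    return "unchanged"


calls = {acc: classify(fold) for acc, fold in fold_72h.items()}
up = sorted(acc for acc, call in calls.items() if call == "up")
down = sorted(acc for acc, call in calls.items() if call == "down")

print("up:", up)
print("down:", down)
```

Under these assumptions, the three Group 1 genes are called down-regulated at 72 h and the remaining five genes with a 72 h value are called up-regulated.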
Apoptosis Regulator
(Figure: six time-course panels, each labeled p1, plotting normalized intensity (log scale, 0.01 to 10) against time (6, 24, 48 and 72 hours).)
Caspase Pathway in CRGD-treated MCF7 Cell
(Figure: pathway diagram with the labels Caspase 10; Caspase 9, Caspase 8 and FADD, Caspase 4; Caspase 7; Caspase 3.)
Searching and Clustering of RGD-containing Protein in Swiss-Prot Database
• In the Swiss-Prot database, there are 541 human RGD-containing proteins, including 5 caspase proteins.
• Caspase 8 was clustered with integrin beta4.
• Caspase 1, caspase 2, caspase 3 and caspase 7 are clustered together.
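Searching a protein database for RGD-containing entries comes down to a substring search over one-letter amino acid sequences. The sketch below shows the idea on a tiny in-memory dictionary. The sequences and names are hypothetical toy data, not Swiss-Prot records, and the function name `find_motif` is an assumption. The clustering step is not reproduced, since the slides do not describe how it was done.

```python
def find_motif(sequence, motif="RGD"):
    """Return the 0-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)   # allow overlapping hits
    return positions


# Hypothetical toy sequences in one-letter amino acid code.
toy_proteins = {
    "toy_protein_1": "MKRGDSLLARGDE",
    "toy_protein_2": "MAAAGGSTV",
}

# Keep only the proteins that contain at least one RGD motif.
hits = {}
for name, seq in toy_proteins.items():
    positions = find_motif(seq)
    if positions:
        hits[name] = positions

print(hits)
```

With a real database, the same loop would run over sequences parsed from the downloaded flat file or FASTA export.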
Please pass the genes: horizontal gene transfer
• Horizontal gene transfer is the acquisition of genetic material by one organism from another.
  – Direct uptake
  – Via a viral carrier