Genes
Dec 20, 2015
Outline
Genes: definitions
Molecular genetics - methodology
Genome content
Molecular structure of mRNA-coding genes
Genetics
Gene regulation
Genetics
Molecular biology
Arrays
Issues
Genetic and misexpression approaches
Gene Definitions
Molecular definition: stretch of DNA that encodes:
Functional RNAs - tRNA, rRNA
Functional proteins - mRNA
All sequences necessary for proper function (genetic) – includes regulatory elements and transcription unit
Generally excludes other types of genomic sequences: centromeres, telomeres, origins of DNA replication, transposons
Genetic definition: element required for proper organismal function
Molecular Genetics
Genetics [mutant phenotype]
Molecular Biology [gene: sequence, expression-arrays]
Biochemistry [activities, interactions]
Cell biology [structure, dynamics]
Genomic Content
Calf Thymus DNA sheared to a size of ~300 bp, denatured, and reannealed
3 classes:
Highly repetitive – 10% DNA - anneals very rapidly
Middle repetitive – 30% DNA - C0t1/2 = 0.04
Non-repetitive (unique) – 60% DNA - C0t1/2 = 4000
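The three kinetic classes can be illustrated with the textbook second-order reassociation model, in which the fraction of a component still single-stranded at a given C0t is 1 / (1 + C0t / C0t1/2). The following is a minimal sketch under that assumption, not the analysis behind the slide. The function names are hypothetical, and because the slide gives no C0t1/2 for the highly repetitive class, the value 1e-4 used for it is a placeholder assumption.

```python
# Fraction of DNA still single-stranded at a given C0t value,
# assuming ideal second-order reassociation: f = 1 / (1 + C0t / C0t_half).

# (fraction of genome, C0t1/2) for each kinetic class.
# Middle-repetitive and unique values come from the slide; the highly
# repetitive C0t1/2 is NOT given there, so 1e-4 is a placeholder assumption.
CLASSES = {
    "highly repetitive": (0.10, 1e-4),
    "middle repetitive": (0.30, 0.04),
    "unique": (0.60, 4000.0),
}


def single_stranded_fraction(c0t, c0t_half):
    """Second-order kinetics: half of the component has reannealed at C0t1/2."""
    return 1.0 / (1.0 + c0t / c0t_half)


def genome_single_stranded(c0t):
    """Weighted sum of the single-stranded fractions of the three classes."""
    return sum(
        fraction * single_stranded_fraction(c0t, c0t_half)
        for fraction, c0t_half in CLASSES.values()
    )


if __name__ == "__main__":
    for c0t in (1e-4, 0.04, 4000.0):
        print(f"C0t = {c0t:g}: {genome_single_stranded(c0t):.3f} single-stranded")
```

Plotting `genome_single_stranded` against log C0t would give the stepped curve from which the three classes are read off.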
Highly Repetitive Simple Sequence DNA
Clusters of tandemly linked 5-10 bp repeats
Can have > 10^6 copies/genome
Not transcribed
Drosophila virilis satellite DNAs: > 95% of each satellite consists of a predominant sequence
Intermediately Repetitive DNA – Mobile Elements
Repetitive elements interspersed among unique DNA
Most are transposons – mobile DNA
Many are no longer able to transpose
Dispersed throughout the genome
Different classes
Transpose as DNA or RNA intermediates
Diagram labels: unique DNA, repeat
Unique DNA-Coding Sequence Genes
Slow kinetic class corresponds mainly to protein-coding genes
Average gene size (transcribed region only) by organism:
E. coli 1.2 kb
Yeast 1.7 kb
Drosophila 11.3 kb
Human 27.0 kb
As complexity increases, so does gene size
Overview of Gene Expression-1
Regulatory region
Transcription unit
DNA > ACGT
RNA > ACGU
Transport to cytoplasm
Nucleus
Overview of Gene Expression-2
aa1 = methionine
Protein - myoglobin
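The flow in the two overview slides (DNA in ACGT, RNA in ACGU, protein starting with methionine) can be sketched in a few lines. This is an illustrative toy, not a full translator: the codon table is deliberately partial, the function names are hypothetical, and the input sequence is an invented example rather than the real myoglobin gene.

```python
# Minimal sketch of the DNA -> RNA -> protein flow.
# The codon table is partial: it covers only the codons used in the
# toy example below, so any other codon raises a KeyError.
CODON_TABLE = {
    "AUG": "Met",  # start codon, hence aa1 = methionine
    "GGU": "Gly",
    "CUG": "Leu",
    "UAA": "STOP",
}


def transcribe(coding_strand):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


if __name__ == "__main__":
    dna = "ATGGGTCTGTAA"  # hypothetical toy sequence, not a real gene
    rna = transcribe(dna)
    print(rna)
    print("-".join(translate(rna)))
```

The sketch skips splicing and transport, which sit between transcription and translation in the slide.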
DNA and Clones
Genomic or chromosomal DNA – genomic clones (exons, introns, spacer, etc.)
Transcription unit – entire region of gene transcribed (exons + introns)
mRNA – cDNA clones (exonic sequences)
ESTs – expressed sequence tags
Oligonucleotides – small stretches of DNA (~20-50 nt)
Human Genome Project: Gene Number
Size: 3,200 Mb
Predicted gene number:
Celera – 39,114
Public consortium – 29,691
RefSeq (known genes) – 11,015
Non-identity: ~64% of novel genes don’t overlap
> 80% of novel genes are expressed, which indicates they are real
Estimate: ~50,000 genes
Estimate: ~64 kb/gene
Transcribed region = 27 kb
Spacer DNA = 37 kb (repeats + control elements)
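The per-gene figures follow from simple arithmetic on the slide's own numbers: genome size divided by the estimated gene count gives the average DNA per gene, and subtracting the average transcribed region leaves the spacer. A short sketch of that calculation (the function names are hypothetical):

```python
# Reproduce the per-gene estimates from the slide's numbers.
GENOME_SIZE_MB = 3200      # human genome size from the slide
ESTIMATED_GENES = 50_000   # gene-number estimate from the slide
TRANSCRIBED_KB = 27        # average transcribed region from the slide


def kb_per_gene(genome_mb, n_genes):
    """Average genomic DNA per gene in kb (1 Mb = 1,000 kb)."""
    return genome_mb * 1000 / n_genes


def spacer_kb(genome_mb, n_genes, transcribed_kb):
    """DNA per gene that lies outside the transcribed region."""
    return kb_per_gene(genome_mb, n_genes) - transcribed_kb


if __name__ == "__main__":
    print(kb_per_gene(GENOME_SIZE_MB, ESTIMATED_GENES), "kb/gene")
    print(spacer_kb(GENOME_SIZE_MB, ESTIMATED_GENES, TRANSCRIBED_KB), "kb spacer")
```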
Human – a large number of transcripts per gene exist because of alternative splicing
Completed Genomic Sequencing Projects
Human – disease genes
Drosophila – model system for animal development and gene control
Strength - genetics
Nematode - model system for development and behavior
Strength - genetics
Fly and human are more closely related than worm and human
Arabidopsis – weed: model plant genetic system
Crop plants – rice, maize
Yeast – typical eukaryotic cell
E. coli
Many pathogenic bacteria - disease
Genome Projects in Progress
Multiple humans – SNPs: disease genes and predispositions
Mouse – model system to study human/mammalian gene function
Strength – knockout mutants
Zebrafish - model vertebrate genetic system
Strength – large-scale genetic screens
Crop plants – poplar, apple, tomato + pests
Additional Drosophila species
Identify gene control regions
Drosophila
Drosophila genome = 180 Mb
Sequenced 120 Mb euchromatic region
60 Mb heterochromatic region unsequenced (few genes)
Annotation – 13,601 predicted genes
Genie – predicts ORFs/exons
Compare to expressed sequence tags (ESTs-cDNAs)
BLAST searches – sequence identity to known genes
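To make "predicts ORFs" concrete, here is a naive open-reading-frame scan. It is only a sketch of the idea and is not how Genie works: it looks at the forward strand only, ignores introns and splice sites, and the function name, the `min_codons` parameter, and the example sequence are all hypothetical.

```python
# Naive ORF scan: find ATG ... in-frame stop codon on the forward strand.
# This ignores introns, the reverse strand, and codon statistics, so it is
# far simpler than a real gene predictor.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) tuples for forward-strand ORFs.

    `start` and `end` are 0-based, end-exclusive, and include the stop codon.
    `min_codons` counts codons from the ATG up to, but not including, the stop.
    An ATG inside a longer ORF in the same frame is reported as its own ORF.
    """
    dna = dna.upper()
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                break
    return orfs


if __name__ == "__main__":
    example = "CCATGGGTCTGTAACC"  # hypothetical toy sequence
    for start, end, seq in find_orfs(example):
        print(start, end, seq)
```

Comparing such predictions against ESTs and BLAST hits, as the slide lists, is what catches the cases where a purely computational call is wrong.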
Complications of Gene Prediction by Computer: Cranky Example
Figure labels: 33 kb genomic region; RT-PCR of embryonic RNA; exons numbered 1-9; Genie predictions CG12561 (1-3), CG14554, CG14553 (1-3), CG14552 (1-4); ESTs LP05454-5' and LP05454-3'
Drosophila Gene Functions
14,113 predicted transcripts with different coding sequences
Biochemical function:
2,081 Transcription factors
2,422 Enzymes
665 Transporters
622 Signal transduction
303 Structural proteins
216 Cell adhesion
7,576 Unknown
Process:
2,274 Metabolism
530 Cell communication
486 Development
201 Physiology
118 Sensation & behavior
8,884 Unknown