HOW TO FIND GENES WITHIN A DNA SEQUENCE?
1. Bacterial genomes - genes tightly packed, no introns
Scan for ORFs (open reading frames)
- check all 6 reading frames (both strands)
- look for significant distance between potential start and stop codon (e.g. 100 codons)
Fig. 5.2
… but when examining short sequences, the start codon (or stop codon) might lie further upstream (or downstream), beyond the ends of the sequence being examined
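The ORF scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the program used for the figures: the function names (`find_orfs`, `reverse_complement`) and the tuple layout are assumptions made here, it only accepts ATG as a start codon, and it uses the standard stop codons. ORFs that run off the end of the sequence without reaching a stop are not reported, which is exactly the short-sequence caveat noted above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Scan all 6 reading frames for ATG...stop ORFs.

    Returns tuples (strand, frame, start, end, n_codons). start and end are
    0-based offsets on the strand that was scanned; end is exclusive and
    includes the stop codon; n_codons counts ATG up to, not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:     # keep only long ORFs
                        orfs.append((strand, frame, start, i + 3, n_codons))
                    start = None                   # look for the next ORF
    return orfs
```

Lowering `min_codons` reports more (mostly spurious) short ORFs; the 100-codon default mirrors the threshold suggested above.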
Use computer programs to search for ORFs:
Query: 3 kb sequence
Potential problems?
- if gene contains intron(s)
- if initiation codon other than ATG (relatively rare)
- if overlapping genes (rare)
- if deviation from standard genetic code (can change default)
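To illustrate the last problem, the sketch below shows how the apparent ORF length changes when the set of stop codons is changed. It assumes a code in which TGA is read as tryptophan rather than stop (as in some mitochondrial genomes); the function name `orf_length` and the toy sequence are invented for the example.

```python
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})
# Assumption for illustration: a genetic code in which TGA is a sense codon.
TGA_SENSE_STOPS = STANDARD_STOPS - {"TGA"}


def orf_length(seq, stops=STANDARD_STOPS):
    """Number of codons from the first codon of seq up to (not including)
    the first in-frame stop codon; None if no stop codon is reached."""
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            return i // 3
    return None


gene = "ATGAAATGAGGGTAA"                      # toy sequence: ATG AAA TGA GGG TAA
print(orf_length(gene))                       # 2: TGA read as stop
print(orf_length(gene, TGA_SENSE_STOPS))      # 4: TGA read through, stops at TAA
```

ORF-finding programs usually expose this choice as a "genetic code" setting, which is what "can change default" refers to.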
2. Eukaryotic genomes (such as human) - genes usually far apart, long introns & short exons
Fig. 5.4
Would an ORF scan work here?
Can also use algorithms to look for:
1. Exon-intron boundaries
- “GT-AG” rule, but consensus sequences very short
2. Regulatory motifs
- upstream promoters, downstream polyA addition signals…
- but consensus sequences usually very short
3. Codon bias patterns
- synonymous codons are not all used equally
- patterns differ among organisms
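Items 1 and 2 above warn that the consensus sequences are very short. A rough sketch of why that matters: under the simplifying assumption of random sequence with equal base frequencies, a 2-nt motif such as GT or AG turns up about once every 16 positions, so most matches are not real splice sites. The function names and the simulation are illustrative assumptions, not part of the original slides.

```python
import random


def motif_positions(seq, motif):
    """0-based start positions of every (possibly overlapping) motif match."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq.startswith(motif, i)]


def expected_by_chance(seq_len, motif_len):
    """Expected number of matches in random sequence, assuming all four
    bases are equally likely and independent (a simplification)."""
    return (seq_len - motif_len + 1) / 4 ** motif_len


def simulate_counts(seq_len, seed=0):
    """Count GT and AG dinucleotides in one random sequence."""
    rng = random.Random(seed)
    seq = "".join(rng.choices("ACGT", k=seq_len))
    return len(motif_positions(seq, "GT")), len(motif_positions(seq, "AG"))


print(expected_by_chance(15000, 2))   # 937.4375 chance matches per 15 kb
print(simulate_counts(15000))         # observed counts in one random sequence
```

This is why gene-prediction programs combine splice-site matches with other evidence rather than relying on GT-AG alone.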
Table 5.1, Brown, 1st ed.
(see Fig. 5.5)
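Codon bias can be measured by counting how often each synonymous codon is used in a coding sequence. The sketch below does this for the six leucine codons; the function names and the toy sequence are invented for illustration, and a real analysis would pool many genes from the organism of interest.

```python
from collections import Counter

LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def codon_counts(cds):
    """Count codons in a coding sequence read in frame from position 0.
    Any trailing 1 or 2 bases are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def synonymous_fractions(cds, synonymous_codons):
    """Fraction of each synonymous codon among all codons for that amino acid.
    Returns an empty dict if the amino acid does not occur."""
    counts = codon_counts(cds)
    total = sum(counts[c] for c in synonymous_codons)
    if total == 0:
        return {}
    return {c: counts[c] / total for c in synonymous_codons}


toy_cds = "ATGCTGCTGCTGTTACTGTAA"     # made-up example, not real data
print(synonymous_fractions(toy_cds, LEUCINE_CODONS))
```

A candidate ORF whose codon usage matches the organism's known pattern is more likely to be a real gene than one that does not.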
See Fig. 5.10, which shows results from various bioinformatics tools used to analyze 15 kb of the human genome
4. Homologous sequences in databank
BLAST (Basic Local Alignment Search Tool) searches
www.ncbi.nlm.nih.gov/BLAST/
- search programs to look for similarity between your sequence of interest (protein or DNA) and entries in global data banks
BLASTP – search at protein level
BLASTN – search at nucleotide level
BLASTX – search nt sequence against protein databases (automatic 6-reading frame conceptual translation)
tBLASTN – protein query vs. conceptual translation of DNA database
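The "6-reading frame conceptual translation" that BLASTX and tBLASTN rely on can be sketched as follows. This is only an illustration of the idea, not BLAST's own code: the function names are assumptions, the standard genetic code is used, and codons containing anything other than A, C, G, T are shown as X.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate in frame from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Return {'+1': ..., '+2': ..., '+3': ..., '-1': ..., '-2': ..., '-3': ...}."""
    seq = seq.upper()
    rc = seq.translate(_COMPLEMENT)[::-1]     # reverse complement
    frames = {}
    for sign, strand in (("+", seq), ("-", rc)):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


print(six_frame_translations("ATGGCCTAA"))
```

Each of the six protein strings is then compared at the amino acid level, which detects similarity that is no longer obvious at the nucleotide level.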
Query = yeast mitochondrial ribosomal protein L8 (238 aa)
[BLAST results figure; hits labelled: Fungal, Bacterial]
E-values: statistical measure of the likelihood that sequences with this degree of similarity occur randomly
i.e. reflects the number of hits expected by chance (the lower the E-value, the more significant the match)
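As a rough guide to how an E-value behaves, the sketch below uses the standard Karlin-Altschul relation between a bit score S', the query length m and the database length n: E = m × n × 2^(−S'). This formula is not given in the slides, and real BLAST output uses corrected "effective" lengths, so the numbers are illustrative only. The function names are assumptions.

```python
import math


def expected_hits(bit_score, query_len, db_len):
    """E-value: number of hits with at least this bit score expected by
    chance, using E = m * n * 2**(-S') with uncorrected lengths."""
    return query_len * db_len * 2.0 ** (-bit_score)


def prob_at_least_one(e_value):
    """Probability of one or more chance hits, P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


# Hypothetical numbers: a 238-aa query against a database of 10**9 residues.
for bits in (30, 50, 100):
    e = expected_hits(bits, 238, 10**9)
    print(bits, e, prob_at_least_one(e))
```

Two points follow from the formula: each extra bit of score halves the E-value, and the same alignment gets a larger (less significant) E-value when searched against a larger database.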
Nomenclature may differ among organisms - e.g. the same ribosomal protein is called L17 in Streptococcus but L8 in yeast