Page 1
• Review of important points from the NCBI lectures.
– Example slides
• Review the two types of microarray platforms.
– Spotted arrays
– Affymetrix
• Specific examples that use microarray technology.
– Gene expression - role of a transcription factor
Page 2
Web Access
• Entrez: text
• BLAST: sequence
• VAST: structure
Page 3
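Translated BLAST
Program   Query            Database
blastx    Nucleotide (N)   Protein (P)
tblastn   Protein (P)      Nucleotide (N)
tblastx   Nucleotide (N)   Nucleotide (N)
Nucleotide sequences are translated into protein (P) in all six reading frames before comparison.
Particularly useful for nucleotide sequences without protein annotations, such as ESTs or genomic DNA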
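The translation step behind blastx, tblastn, and tblastx can be sketched as a six-frame translation. This is a minimal illustration using the standard genetic code, not the BLAST implementation; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate from the first base; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return the three forward and three reverse-complement translations."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


frames = six_frame_translations("ATGGCC")
print(frames["+1"], frames["-1"])
```

Each of the six protein strings is then compared against the protein (or translated) database, which is why an unannotated EST can still find protein-level matches.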
Page 4
Position Specific Score Matrix (PSSM)
Pos AA   A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
206 D    0  -2   0   2  -4   2   4  -4  -3  -5  -4   0  -2  -6   1   0  -1  -6  -4  -1
207 G   -2  -1   0  -2  -4  -3  -3   6  -4  -5  -5   0  -2  -3  -2  -2  -1   0  -6  -5
208 V   -1   1  -3  -3  -5  -1  -2   6  -1  -4  -5   1  -5  -6  -4   0  -2  -6  -4  -2
209 I   -3   3  -3  -4  -6   0  -1  -4  -1   2  -4   6  -2  -5  -5  -3   0  -1  -4   0
210 S   -2  -5   0   8  -5  -3  -2  -1  -4  -7  -6  -4  -6  -7  -5   1  -3  -7  -5  -6
211 S    4  -4  -4  -4  -4  -1  -4  -2  -3  -3  -5  -4  -4  -5  -1   4   3  -6  -5  -3
212 C   -4  -7  -6  -7  12  -7  -7  -5  -6  -5  -5  -7  -5   0  -7  -4  -4  -5   0  -4
213 N   -2   0   2  -1  -6   7   0  -2   0  -6  -4   2   0  -2  -5  -1  -3  -3  -4  -3
214 G   -2  -3  -3  -4  -4  -4  -5   7  -4  -7  -7  -5  -4  -4  -6  -3  -5  -6  -6  -6
215 D   -5  -5  -2   9  -7  -4  -1  -5  -5  -7  -7  -4  -7  -7  -5  -4  -4  -8  -7  -7
216 S   -2  -4  -2  -4  -4  -3  -3  -3  -4  -6  -6  -3  -5  -6  -4   7  -2  -6  -5  -5
217 G   -3  -6  -4  -5  -6  -5  -6   8  -6  -8  -7  -5  -6  -7  -6  -4  -5  -6  -7  -7
218 G   -3  -6  -4  -5  -6  -5  -6   8  -6  -7  -7  -5  -6  -7  -6  -2  -4  -6  -7  -7
219 P   -2  -6  -6  -5  -6  -5  -5  -6  -6  -6  -7  -4  -6  -7   9  -4  -4  -7  -7  -6
220 L   -4  -6  -7  -7  -5  -5  -6  -7   0  -1   6  -6   1   0  -6  -6  -5  -5  -4   0
221 N   -1  -6   0  -6  -4  -4  -6  -6  -1   3   0  -5   4  -3  -6  -2  -1  -6  -1   6
222 C    0  -4  -5  -5  10  -2  -5  -5   1  -1  -1  -5   0  -1  -4  -1   0  -5   0   0
223 Q    0   1   4   2  -5   2   0   0   0  -4  -2   1   0   0   0  -1  -1  -3  -3  -4
224 A   -1  -1   1   3  -4  -1   1   4  -3  -4  -3  -1  -2  -2  -3   0  -2  -2  -2  -3
Serine is scored differently in these two positions
Active site nucleophile
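A short sketch of how a PSSM is read, using rows copied from the matrix above. It assumes the two serine positions being contrasted are 211 and 216 (the slide does not name them); the function names are hypothetical.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Rows copied from the PSSM on this page: position -> scores in AMINO_ACIDS order.
PSSM_ROWS = {
    211: [4, -4, -4, -4, -4, -1, -4, -2, -3, -3, -5, -4, -4, -5, -1, 4, 3, -6, -5, -3],
    214: [-2, -3, -3, -4, -4, -4, -5, 7, -4, -7, -7, -5, -4, -4, -6, -3, -5, -6, -6, -6],
    215: [-5, -5, -2, 9, -7, -4, -1, -5, -5, -7, -7, -4, -7, -7, -5, -4, -4, -8, -7, -7],
    216: [-2, -4, -2, -4, -4, -3, -3, -3, -4, -6, -6, -3, -5, -6, -4, 7, -2, -6, -5, -5],
    217: [-3, -6, -4, -5, -6, -5, -6, 8, -6, -8, -7, -5, -6, -7, -6, -4, -5, -6, -7, -7],
}


def pssm_score(position, residue):
    """Score for placing `residue` at an alignment position."""
    return PSSM_ROWS[position][AMINO_ACIDS.index(residue)]


def score_window(start, peptide):
    """Sum of per-position scores for a peptide aligned at `start`."""
    return sum(pssm_score(start + i, aa) for i, aa in enumerate(peptide))


print(pssm_score(211, "S"), pssm_score(211, "A"))  # serine and alanine tie here
print(pssm_score(216, "S"), pssm_score(216, "A"))  # only serine is tolerated
print(score_window(214, "GDSG"), score_window(214, "GDAG"))
```

A general matrix such as BLOSUM62 gives serine the same score everywhere; the PSSM rewards it strongly only where the family conserves it.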
Page 5
PSI-BLAST
Create your own PSSM:
Confirming relationships of purine nucleotide metabolism proteins
query → BLOSUM62 → Alignment → PSSM → Alignment
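The step from an alignment to a PSSM can be sketched as column-wise log-odds scores. This is a simplified illustration, not the PSI-BLAST procedure: it assumes a uniform background frequency, a flat pseudocount, no sequence weighting, and half-bit scaling. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def build_pssm(aligned, pseudocount=1.0, background=None):
    """Return one {residue: score} dict per alignment column."""
    if background is None:
        # Assumption: uniform background; real tools use observed frequencies.
        background = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    pssm = []
    for col in range(len(aligned[0])):
        # Count residues in this column, ignoring gaps and unknown symbols.
        counts = Counter(seq[col] for seq in aligned if seq[col] in background)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        row = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Log-odds against background, scaled to half-bits and rounded.
            row[aa] = round(2 * math.log2(freq / background[aa]))
        pssm.append(row)
    return pssm


toy_alignment = ["GDSGG", "GDSGG", "GDSGA", "GDTGG"]  # made-up sequences
pssm = build_pssm(toy_alignment)
print(pssm[2]["S"], pssm[2]["T"], pssm[2]["A"])
```

In the iterative search, the matrix built this way replaces BLOSUM62 for the next round, so conserved columns count for more than variable ones.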
Page 6
Affymetrix vs. glass slide based arrays
• Affymetrix
– Short oligonucleotides
– Many oligos per gene
– Single sample hybridized to chip
• Glass slide
– Long oligonucleotides or PCR products
– A single oligo or PCR product per gene
– Two samples hybridized to chip
Page 7
Bacterial DNA microarrays
• Small genome size
• Fully sequenced genomes, well annotated
• Ease of producing biological replicates
• Genetics
Page 8
Applications of DNA microarrays
• Monitor gene expression
– Study regulatory networks
– Drug discovery - mechanism of action
– Diagnostics - tumor diagnosis
– etc.
• Genomic DNA hybridizations
– Explore microbial diversity
– Whole genome comparisons
– Diagnostics - tumor diagnosis
• ?
Page 9
Characterization of the stationary phase sigma factor regulon (σH) in Bacillus subtilis
• Robert A. Britton and Alan D. Grossman - Massachusetts Institute of Technology.
• Patrick Eichenberger, Eduardo Gonzalez-Pastor, and Richard Losick - Harvard University.
Page 10
What is a sigma factor?
• Directs RNA polymerase to promoter sequences
• Bacteria use many sigma factors to turn on regulatory networks at different times.
– Sporulation
– Stress responses
– Virulence
Wosten, 1998
Page 11
Alternative sigma factors in B. subtilis sporulation
Kroos and Yu, 2000
Page 12
The stationary phase sigma factor: σH
• Most active at the transition from exponential growth to stationary phase
• Mutants are blocked at stage 0 of sporulation
• Known targets involved in:
– phosphorelay (kinA, spo0F)
– sporulation (sigF, spoVG)
– cell division (ftsAZ)
– cell wall (dacC)
– general metabolism (citG)
– phosphatase inhibitors (phr peptides)
Page 13
Experimental approach
• Compare expression profiles of wt and ∆sigH mutant at times when sigH is active.
• Artificially induce the expression of sigH during exponential growth.
– When Sigma-H is normally not active.
– Might miss genes that depend on additional factors other than Sigma-H.
• Identify potential promoters using computer searches.
[Figure labels: Pspac, sigH]
Page 14
Workflow (∆sigH and wild-type samples): Grow cells → Isolate RNA → Make labeled cDNA → Mix and hybridize → Scan slide → Analyze data
Page 15
[Figure: Hour -1, Hour 0, Hour +1; wild type (Cy5) vs. sigH mutant (Cy3); labels: citG, sacT]
Page 17
Identifying differentially expressed genes
• Many different methods
• Arbitrary assignment of a fold-change cutoff is not a valid approach
• Statistical representation of the data
– Iterative outlier analysis
– SAM (significance analysis of microarrays)
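A small sketch of why a fold-change cutoff alone can mislead. It uses a plain one-sample t-statistic on replicate log2 ratios, which is a stand-in for illustration and is neither SAM nor iterative outlier analysis. The gene names and values are made up.

```python
import math
import statistics


def one_sample_t(log_ratios):
    """t-statistic for the mean log2 ratio differing from zero."""
    n = len(log_ratios)
    mean = statistics.mean(log_ratios)
    sd = statistics.stdev(log_ratios)
    return mean / (sd / math.sqrt(n))


# Hypothetical replicate log2 ratios; both genes average a 2-fold change.
gene_a = [1.1, 0.9, 1.0, 1.0]    # consistent across replicates
gene_b = [3.0, -1.0, 2.5, -0.5]  # same mean, but very noisy

print(round(one_sample_t(gene_a), 1), round(one_sample_t(gene_b), 1))
```

Both genes pass a 2-fold cutoff on their mean, but only the first is supported by the replicates.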
Page 18
Data from a microarray are expressed as ratios
• Cy3/Cy5 or Cy5/Cy3
• Measuring differences in two samples, not absolute expression levels
• Ratios are often log2 transformed before analysis
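A minimal sketch of the log2 transform, assuming background-corrected, positive spot intensities. The function name and intensity values are hypothetical.

```python
import math


def log2_ratio(cy5, cy3):
    """log2(Cy5/Cy3) for one spot; intensities must be positive."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


print(log2_ratio(2000, 1000))  # 2-fold up in the Cy5 sample
print(log2_ratio(1000, 2000))  # 2-fold down in the Cy5 sample
print(log2_ratio(1500, 1500))  # no change
```

On the raw scale, induction spans 1 to infinity while repression is squeezed between 0 and 1; after the transform, equal fold changes in either direction sit at equal distances from zero.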
Page 19
Genes whose transcription is influenced by σH
• 433 genes were altered when comparing wt vs. ∆sigH.
• 160 genes were altered when sigH was overexpressed.
• Which genes are directly regulated by Sigma-H?
Page 20
Identifying sigH promoters
• Two bioinformatics approaches
– Hidden Markov Model database (P. Fawcett)
   • HMMER 2.2 (hmm.wustl.edu)
– Pattern searches (SubtiList)
• Identify 100s of potential promoters
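A pattern search of this kind can be sketched with a regular expression that matches two sequence boxes separated by a variable spacer. The boxes and spacer used here are placeholders for illustration, not the Sigma-H consensus; the function names are hypothetical, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]", "N": "[ACGT]",
}


def motif_to_regex(box_up, box_down, min_gap, max_gap):
    """Compile 'box_up, spacer of min_gap..max_gap bases, box_down'."""
    up = "".join(IUPAC[b] for b in box_up)
    down = "".join(IUPAC[b] for b in box_down)
    # The lookahead lets overlapping candidate sites all be reported.
    return re.compile(f"(?=({up}[ACGT]{{{min_gap},{max_gap}}}{down}))")


def find_promoters(seq, pattern):
    """Return (0-based start, matched text) for every candidate site."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


pattern = motif_to_regex("AGGA", "GAAT", 11, 12)  # placeholder boxes
seq = "TTTT" + "AGGA" + "C" * 11 + "GAAT" + "TTTT"  # made-up sequence
print(find_promoters(seq, pattern))
```

Short degenerate patterns match often by chance, which is consistent with a genome-wide search returning hundreds of candidates that then need experimental support.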
Page 21
Correlate potential sigH promoters with genes identified with microarray data.
• Genes positively regulated by Sigma-H in a microarray experiment that have a putative promoter within 500bp of the gene.
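The correlation step can be sketched as a filter that keeps positively regulated genes with a predicted promoter nearby. This assumes "within 500bp" means within 500 bp upstream of the gene start on the gene's strand; the function name, gene names, and coordinates are hypothetical.

```python
def direct_targets(genes, promoters, induced, max_distance=500):
    """Return induced genes with a predicted promoter close upstream.

    genes: {name: (start_coordinate, strand)} with strand "+" or "-"
    promoters: iterable of predicted promoter coordinates
    induced: set of gene names positively regulated on the array
    """
    hits = []
    for name, (start, strand) in genes.items():
        if name not in induced:
            continue
        for p in promoters:
            # Upstream distance depends on which strand the gene is on.
            upstream = start - p if strand == "+" else p - start
            if 0 <= upstream <= max_distance:
                hits.append(name)
                break
    return sorted(hits)


# Made-up coordinates for illustration only.
genes = {"geneA": (1000, "+"), "geneB": (5000, "+"), "geneC": (9000, "-")}
promoters = [800, 4000, 9300]
induced = {"geneA", "geneB", "geneC"}
print(direct_targets(genes, promoters, induced))
```

Requiring both lines of evidence removes chance promoter matches near unaffected genes and removes array hits that respond to Sigma-H only indirectly.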
Page 22
Directly controlled sigH genes
• 26 new sigH promoters controlling 54 genes
• Genes involved in key processes associated with the transition to stationary phase
– generation of new food sources (e.g., proteases)
– transport of nutrients
– cell wall metabolism
– cytochrome biogenesis
• Correctly identified nearly all known sigH promoters
• Complete sigH regulon:
– 49 promoters controlling 87 genes.
Page 23
• Identification of DNA regions bound by proteins.
Iyer et al. 2001 Nature, 409:533-538
Page 24
Workflow (Pathogen 1 and Pathogen 2 samples): Grow cells → Isolate RNA → Make labeled cDNA → Mix and hybridize → Scan slide → Analyze data