Engineering Systems Biology
Lots of Questions...
Bahrad A. Sokhansanj, PhD
Molecular Health Engineering Laboratory
School of Biomedical Engineering, Science & Health Systems, Drexel University
ECE690 Biological Signal Processing II
March 31, 2008
Many Ways to Identify Signatures
• Identifying major “components” of variation (potentially something that has to be removed from data, such as a fundamental difference between sampled groups)
  – singular value decomposition (SVD), principal component analysis (PCA), etc.
• Finding groups within data
  – clustering
  – self-organizing maps
  – support vector machines
• Separating known groups
  – univariate methods (e.g. B-tests, t-tests on each gene, ANOVA on each gene)
  – horrible “capitalization on chance” problems
  – linear discriminant analysis / canonical variate analysis
    • these methods can be generalized for underdetermined data, though the relative magnitudes of variables become significant in that case (but that filters out potentially noisy data) OR you get capitalization on chance by using stepwise methods
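The “capitalization on chance” problem with per-gene tests can be illustrated with a minimal sketch. It assumes simulated data that is pure noise (no gene truly differs between groups), a pooled-variance two-sample t statistic, and a two-sided 5% cutoff of 2.101 for 18 degrees of freedom. The function names, group size, and gene count are hypothetical choices for illustration, not values from the lecture.

```python
import random
import statistics


def pooled_t(a, b):
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (
        pooled_var * (1 / na + 1 / nb)
    ) ** 0.5


def count_chance_hits(n_genes=1000, n_per_group=10, t_crit=2.101, seed=0):
    """Count 'significant' genes when both groups are drawn from the same noise.

    t_crit=2.101 is the two-sided 5% critical value for 18 degrees of freedom,
    which matches n_per_group=10; change both together.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if abs(pooled_t(group_a, group_b)) > t_crit:
            hits += 1
    return hits


hits = count_chance_hits()
print(f"{hits} of 1000 null genes pass an uncorrected 5% t-test")
```

Roughly 5% of the genes are expected to pass by chance alone, which is why an uncorrected gene-by-gene screen needs a multiple-testing correction or an independent validation set before any gene is called a signature.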
Table 2 Amino acid substitution variants identified in DNA repair and repair-related genes

Gene   Exon  Codon  Common   Variant  Allele     Mouse    cDNA sequence 5'→3'
name                residue  residue  frequency  residue
APE1   3     51     Gln      His      0.03       Gln      GAT CA(G/C) AAAAC
APE1   3     64     Ile      Val      0.01       Ile      TCAAG (A/G)TC TGC
APE1   5     148    Asp      Glu      0.33       Glu      GGC GA(T/G)GAGGA
APE1   5     241    Gly      Arg      0.01       Gly      GCTTC (G/A)GGGAA
FEN1   No variants
LIG1   3     24     Ala      Val      0.01       Thr      GGAG G(C/T)A TCCA
LIG1   4     62     Arg      Trp      0.01       Gln      CGGCC (C/T)GG GTC
LIG1   9     249    Gly      Glu      0.01       Gly      GCCA G(G/A)GGCTC
LIG1   10    267    Asn      Ser      0.02       Asn      TTAC A(A/G)TCCTG
LIG1   13    369    Val      Ile      0.01       Ile      AGTCC (G/A)TC CGG
LIG1   13    409    Arg      His      0.01       Cys      GTTC C(G/A)C GACA
LIG1   16    480    Met      Val      0.01       Val      CAGCC (A/G)TG GTG
LIG1   20    614    Thr      Ile      0.01       Thr      GGTC A(C/T)A TCCT
LIG1   22    673    Glu      Asp      0.01       Gln      CGT GA(G/T)CCCCT
LIG1   22    677    Arg      Leu      0.01       Arg      TTCC C(G/T)G CGCC
LIG3   18    780    Arg      His      0.03       Cys      GTCC C(G/A)C AAGG
LIG3   19    811    Lys      Thr      0.01       Lys      TGCA A(A/C)GCCTT
LIG3   21    899    Pro      Ser      0.01       Thr      AGAAC (C/T)CT GCG
POLB   1     8      Gln      Arg      0.01       Gln      GCCG C(A/G)G GAGA
POLB   7     137    Arg      Gln      0.006      Arg      TCAG C(G/A)AATTG
POLB   12    242    Pro      Arg      0.005      Pro      GCTT C(C/G)C AGTA
POLD1  1     19     Arg      His      0.12       Arg      GGCC C(G/A)T GGGG
POLD1  1     30     Arg      Trp      0.006      Ser      CACCT (C/T)GG CCA
POLD1  3     119    Arg      His      0.15       Arg      ATCC C(G/A)C GGCT
POLD1  4     173    Ser      Asn      0.05       Ser      CATC A(G/A)CCGGG
POLD1  4     177    Arg      His      0.003      Arg      CAGT C(G/A)CGGGG
POLD1  19    849    Arg      His      0.011      Arg      ACTG C(G/A)CCGCC
POLD1  26    1086   Arg      Gln      0.01       Arg      GGTG C(G/A)GAAGG
Mohrenweiser HW, Xi T, Vazquez-Matias J, Jones IM. Identification of 127 amino acid substitution variants in screening 37 DNA repair genes in humans. Cancer Epidemiol. Biomarkers Prev., 11: 1054-1064, 2002.
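The cDNA column of Table 2 writes each variant as (common base/variant base) inside its codon. The sketch below shows one way to expand that notation and translate both codons with the standard genetic code. It assumes that the whitespace-separated chunk containing the parenthesis starts on a codon boundary, which is an inference from the table layout rather than something the source states; the function names are hypothetical.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

VARIANT_RE = re.compile(r"([ACGT]*)\(([ACGT])/([ACGT])\)([ACGT]*)")


def variant_codons(cdna):
    """Return (common_codon, variant_codon) for a string like 'GGC GA(T/G)GAGGA'.

    Assumption: the chunk holding '(' begins at a codon boundary.
    """
    chunk = next(part for part in cdna.split() if "(" in part)
    match = VARIANT_RE.fullmatch(chunk)
    if match is None:
        raise ValueError(f"unrecognized variant notation: {cdna!r}")
    prefix, common, variant, suffix = match.groups()
    prefix = prefix[len(prefix) // 3 * 3:]      # drop any complete leading codons
    tail = suffix[: 2 - len(prefix)]            # bases needed to finish the codon
    if len(prefix) + 1 + len(tail) != 3:
        raise ValueError(f"not enough flanking sequence in {cdna!r}")
    return prefix + common + tail, prefix + variant + tail


def residue_change(cdna):
    """Translate both codons to one-letter amino acid codes."""
    common_codon, variant_codon = variant_codons(cdna)
    return CODON_TABLE[common_codon], CODON_TABLE[variant_codon]


# APE1 codon 148 from Table 2: GAT -> GAG, i.e. Asp (D) -> Glu (E).
print(variant_codons("GGC GA(T/G)GAGGA"), residue_change("GGC GA(T/G)GAGGA"))
```

Under that codon-boundary assumption, the translated pair can be compared against the "Common residue" and "Variant residue" columns as a consistency check on each row.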
Hadi, M. Z., Coleman, M. A., Fidelis, K., Mohrenweiser, H. W. and Wilson, D. M. III, Nucleic Acids Res., 28, 3871-3879, 2000.
• Differential equations for each enzymatic activity: kcat, KM and protein concentrations taken from experimental data
• Based on physical measurements of the cell: assume well-mixed proteins, but kcat/KM adjusted for slowed diffusion in the nucleus
• Model is consistent with experimental mechanistic data (i.e. predominance of short patch BER and coordination between proteins)
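The slide does not give the equations themselves. As a hedged illustration only, a rate law of the Michaelis-Menten type is one common way to write "a differential equation for each enzymatic activity" in terms of kcat, KM and protein concentration. The symbols below are assumptions introduced for this sketch: [S_i] is the repair intermediate consumed by step i, [E_i] the concentration of the enzyme for that step, and v_0 the rate at which damage is formed.

```latex
\frac{d[S_i]}{dt} = v_{i-1} - v_i ,
\qquad
v_i = \frac{k_{\mathrm{cat},i}\,[E_i]\,[S_i]}{K_{M,i} + [S_i]}
```

In this form each intermediate is produced by the preceding step and consumed by its own enzyme, so a variant with reduced kcat or reduced protein level lowers v_i and lets [S_i] accumulate.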
Percentage increase in Steady State Damage (for continuous formation of damage) and Repair Time (for an initial amount of damage to be cleared) given sub-functional, potentially non-lethal variants
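The two quantities in this caption can be sketched for a deliberately simplified case: a single Michaelis-Menten repair step rather than the full multi-enzyme pathway of the model. All parameter values and function names are hypothetical and chosen only to make the arithmetic visible; a sub-functional variant is represented as a fraction of wild-type Vmax (kcat times enzyme concentration).

```python
import math


def steady_state_damage(formation_rate, vmax, km):
    """Damage level where formation balances repair: f = vmax*D/(km + D)."""
    if formation_rate >= vmax:
        raise ValueError("repair saturates; no finite steady state")
    return formation_rate * km / (vmax - formation_rate)


def repair_time(d0, d_end, vmax, km):
    """Time to clear damage from d0 down to d_end with no new damage forming.

    Closed-form integral of dD/dt = -vmax*D/(km + D).
    """
    return (km * math.log(d0 / d_end) + (d0 - d_end)) / vmax


def percent_increase(variant_value, wild_type_value):
    return 100.0 * (variant_value / wild_type_value - 1.0)


def simulate_damage(formation_rate, vmax, km, d0=0.0, dt=0.01, t_end=200.0):
    """Forward-Euler integration of dD/dt = f - vmax*D/(km + D)."""
    d = d0
    for _ in range(int(t_end / dt)):
        d += dt * (formation_rate - vmax * d / (km + d))
    return d


# Hypothetical parameters (arbitrary units), not taken from the lecture.
F, VMAX, KM = 1.0, 10.0, 10.0
ACTIVITY = 0.5  # variant retains 50% of wild-type activity

wt_ss = steady_state_damage(F, VMAX, KM)
var_ss = steady_state_damage(F, ACTIVITY * VMAX, KM)
wt_t = repair_time(100.0, 1.0, VMAX, KM)
var_t = repair_time(100.0, 1.0, ACTIVITY * VMAX, KM)

print(f"steady-state damage: +{percent_increase(var_ss, wt_ss):.0f}%")
print(f"repair time:         +{percent_increase(var_t, wt_t):.0f}%")
```

In this one-step sketch the repair time scales as 1/activity, while the steady-state damage rises faster than that as the formation rate approaches the reduced Vmax. The multi-step model described in the lecture would not necessarily show the same proportions.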