FBW1-12-2015
Wim Van Criekinge
GitHub: Hosted GIT
• Largest open source git hosting site
• Public and private options
• User-centric rather than project-centric
• http://github.ugent.be (use your UGent login and password)
– Accept invitation from Bioinformatics-I-2015
URI:
– https://github.ugent.be/Bioinformatics-I-2015/Python.git
Control Structures
if condition:
    statements
[elif condition:
    statements] ...
else:
    statements

while condition:
    statements

for var in sequence:
    statements

break
continue
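A minimal sketch of these control structures applied to a protein sequence. The function names and the residue groupings ("AVILMFW" as hydrophobic, "DEKR" as charged) are assumptions chosen for illustration only.

```python
def count_residues(sequence):
    """Count residue classes with for / if / elif / else / break / continue."""
    counts = {"hydrophobic": 0, "charged": 0, "other": 0}
    for residue in sequence:
        if residue == "-":        # alignment gap: skip to the next residue
            continue
        if residue == "*":        # stop symbol: leave the loop entirely
            break
        if residue in "AVILMFW":  # illustrative grouping, not a standard
            counts["hydrophobic"] += 1
        elif residue in "DEKR":
            counts["charged"] += 1
        else:
            counts["other"] += 1
    return counts


def first_position(sequence, residue):
    """Return the 0-based index of the first match, or -1, using while."""
    i = 0
    while i < len(sequence):
        if sequence[i] == residue:
            return i
        i += 1
    return -1


print(count_residues("EPNRLLVVEGYMDVVAL"))
print(first_position("EPNRLLVVEGYMDVVAL", "G"))
```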
Extra Questions (2)
• How many human proteins are in Swiss-Prot?
• What is the longest human protein? The shortest?
• Calculate the MW and pI of all human proteins and display them as two histograms (2D scatter?)
• How many human proteins have “cancer” in their description?
• Which gene has the highest number of SNPs/somatic mutations (COSMIC)?
• How many human DNA-repair enzymes are represented in Swiss-Prot (using description / GO)?
• List proteins that contain only alpha-helices based on the Chou-Fasman algorithm
• List proteins based on the number of predicted transmembrane regions (Kyte-Doolittle)
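One possible starting point for the length and description questions, as a sketch only. It assumes the proteins are available as FASTA text; the three records in TOY_FASTA are made up for illustration and are not Swiss-Prot entries, and parse_fasta is a hypothetical helper, not a library function.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:          # do not forget the last record
        records.append((header, "".join(chunks)))
    return records


# Made-up records, for illustration only
TOY_FASTA = """\
>toy1 Hypothetical DNA repair protein
MKTAYIAKQR
QISFVKSHFS
>toy2 Hypothetical cancer-associated antigen
MDIL
>toy3 Hypothetical kinase
MSEQNNTEMTFQIQR
"""

records = parse_fasta(TOY_FASTA)
longest = max(records, key=lambda rec: len(rec[1]))
shortest = min(records, key=lambda rec: len(rec[1]))
cancer_hits = [header for header, _ in records if "cancer" in header.lower()]

print(len(records), "records")
print("longest:", longest[0], len(longest[1]))
print("shortest:", shortest[0], len(shortest[1]))
print("cancer in description:", len(cancer_hits))
```

With a real Swiss-Prot download, the same loop would run over the file contents instead of TOY_FASTA.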
Primary sequence reveals important clues about a protein
DnaG E. coli   ...EPNRLLVVEGYMDVVAL...
DnaG S. typ    ...EPQRLLVVEGYMDVVAL...
DnaG B. subt   ...KQERAVLFEGFADVYTA...
gp4 T3         ...GGKKIVVTEGEIDMLTV...
gp4 T7         ...GGKKIVVTEGEIDALTV...
: *: :: * * : :
small hydrophobic
large hydrophobic
polar
positive charge
negative charge
• Evolution conserves amino acids that are important to protein structure and function across species. Sequence comparison of multiple “homologs” of a particular protein reveals highly conserved regions that are important for function.
• Clusters of conserved residues are called “motifs” -- motifs carry out a particular function or form a particular structure that is important for the conserved protein.
motif
The hydropathy index of an amino acid is a number representing the hydrophobic or hydrophilic properties of its side-chain.
It was proposed by Jack Kyte and Russell Doolittle in 1982.
The larger the number is, the more hydrophobic the amino acid. The most hydrophobic amino acids are isoleucine (4.5) and valine (4.2). The most hydrophilic ones are arginine (-4.5) and lysine (-3.9).
This is very important in protein structure; hydrophobic amino acids tend to be internal in the protein 3D structure, while hydrophilic amino acids are more commonly found towards the protein surface.
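The scale can be written down as a Python dictionary, as a sketch. The values are the published Kyte-Doolittle hydropathy indices; the names KD_SCALE and mean_hydropathy are chosen here for illustration.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter code)
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy index over all residues of a sequence."""
    return sum(KD_SCALE[aa] for aa in sequence) / len(sequence)


most_hydrophobic = max(KD_SCALE, key=KD_SCALE.get)   # isoleucine
most_hydrophilic = min(KD_SCALE, key=KD_SCALE.get)   # arginine

print(most_hydrophobic, KD_SCALE[most_hydrophobic])
print(most_hydrophilic, KD_SCALE[most_hydrophilic])
print(mean_hydropathy("EPNRLLVVEGYMDVVAL"))
```

A positive mean suggests a predominantly hydrophobic stretch, a negative mean a predominantly hydrophilic one.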
Hydropathy index of amino acids
5-hydroxytryptamine receptor 2A isoform 1 [Homo sapiens]
NCBI Reference Sequence: NP_000612.1
>gi|10835175|ref|NP_000612.1| 5-hydroxytryptamine receptor 2A isoform 1 [Homo sapiens]
MDILCEENTSLSSTTNSLMQLNDDTRLYSNDFNSGEANTSDAFNWTVDSENRTNLSCEGCLSPSCLSLLHLQEKNWSALLTAVVIILTIAGNILVIMAVSLEKKLQNATNYFLMSLAIADMLLGFLVMPVSMLTILYGYRWPLPSKLCAVWIYLDVLFSTASIMHLCAISLDRYVAIQNPIHHSRFNSRTKAFLKIIAVWTISVGISMPIPVFGLQDDSKVFKEGSCLLADDNFVLIGSFVSFFIPLTIMVITYFLTIKSLQKEATLCVSDLGTRAKLASFSFLPQSSLSSEKLFQRSIHREPGSYTGRRTMQSISNEQKACKVLGIVFFLFVVMWCPFFITNIMAVICKESCNEDVIGALLNVFVWIGYLSSAVNPLVYTLFNKTYRSAFSRYIQCQYKENKKPLQLILVNTIPALAYKSSQLQMGQKKNSKQDAKTTDNDCSMVALGKQHSEEASKDNSDGVNEKVSCV
Kyte-Doolittle Hydropathy Plot (http://gcat.davidson.edu/DGPB/kd/kyte-doolittle.htm)
Possible transmembrane fragment
Window size – 9, strong negative peaks indicate possible surface regions
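A sliding-window hydropathy profile with window size 9 could be sketched as follows. The function name hydropathy_profile is an assumption, and FRAGMENT is a short stretch copied from the 5-hydroxytryptamine receptor 2A sequence above; this is not necessarily how the web tool computes its plot.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter code)
KD_SCALE = dict(zip(
    "ARNDCQEGHILKMFPSTWYV",
    [1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
     3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2],
))


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of every window of the given size along the sequence."""
    values = [KD_SCALE[aa] for aa in sequence]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Stretch taken from the receptor sequence shown earlier
FRAGMENT = "LLTAVVIILTIAGNILVIMAVSLEKKLQNATNY"

profile = hydropathy_profile(FRAGMENT, window=9)
for start, score in enumerate(profile):
    # Window positions are reported 1-based, as in most sequence viewers
    print(f"{start + 1:3d}-{start + 9:3d}  {score:6.2f}")
```

High positive window means mark hydrophobic stretches (candidate transmembrane fragments), while strongly negative ones mark candidate surface regions. Plotting `profile` against window position gives the hydropathy plot.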
Surface region of a protein
Prediction of transmembrane helices in proteins (TMHMM)
5-hydroxytryptamine receptor 2A (Mus musculus)
5-hydroxytryptamine receptor 2 (Graphical output)
Examen.py
http://bioinformatics.biobix.be/examen/
Check availability of PC rooms plus additional dates